[rt] Update to 4.9.18-rt14

Ben Hutchings 2017-03-29 04:57:14 +01:00
parent 5c465af56e
commit 9984b67924
310 changed files with 2984 additions and 618 deletions

debian/changelog

@ -118,7 +118,23 @@ linux (4.9.18-1) UNRELEASED; urgency=medium
and IOMMU setup
* Ignore ABI changes in bpf, dccp, libiscsi
* [x86] Ignore ABI changes in kvm
* [rt] Update to 4.9.18-rt13 (no functional change)
* [rt] Update to 4.9.18-rt14:
- lockdep: Fix per-cpu static objects
- futex: Cleanup variable names for futex_top_waiter()
- futex: Use smp_store_release() in mark_wake_futex()
- futex: Remove rt_mutex_deadlock_account_*()
- futex,rt_mutex: Provide futex specific rt_mutex API
- futex: Change locking rules
- futex: Cleanup refcounting
- futex: Rework inconsistent rt_mutex/futex_q state
- futex: Pull rt_mutex_futex_unlock() out from under hb->lock
- futex,rt_mutex: Introduce rt_mutex_init_waiter()
- futex,rt_mutex: Restructure rt_mutex_finish_proxy_lock()
- futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()
- futex: Futex_unlock_pi() determinism
- futex: Drop hb->lock before enqueueing on the rtmutex
- futex: workaround migrate_disable/enable in different context
- Revert "kernel/futex: don't deboost too early"
-- Ben Hutchings <ben@decadent.org.uk> Mon, 27 Mar 2017 21:54:36 +0100


@ -0,0 +1,118 @@
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 22 Mar 2017 11:35:48 +0100
Subject: [PATCH] futex: Cleanup variable names for futex_top_waiter()
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Upstream commit 499f5aca2cdd5e958b27e2655e7e7f82524f46b1
futex_top_waiter() returns the top waiter on the pi_mutex. Assigning
this to a variable 'match' totally obscures the code.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: juri.lelli@arm.com
Cc: bigeasy@linutronix.de
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170322104151.554710645@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/futex.c | 30 +++++++++++++++---------------
1 file changed, 15 insertions(+), 15 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1120,14 +1120,14 @@ static int attach_to_pi_owner(u32 uval,
static int lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
union futex_key *key, struct futex_pi_state **ps)
{
- struct futex_q *match = futex_top_waiter(hb, key);
+ struct futex_q *top_waiter = futex_top_waiter(hb, key);
/*
* If there is a waiter on that futex, validate it and
* attach to the pi_state when the validation succeeds.
*/
- if (match)
- return attach_to_pi_state(uval, match->pi_state, ps);
+ if (top_waiter)
+ return attach_to_pi_state(uval, top_waiter->pi_state, ps);
/*
* We are the first waiter - try to look up the owner based on
@@ -1174,7 +1174,7 @@ static int futex_lock_pi_atomic(u32 __us
struct task_struct *task, int set_waiters)
{
u32 uval, newval, vpid = task_pid_vnr(task);
- struct futex_q *match;
+ struct futex_q *top_waiter;
int ret;
/*
@@ -1200,9 +1200,9 @@ static int futex_lock_pi_atomic(u32 __us
* Lookup existing state first. If it exists, try to attach to
* its pi_state.
*/
- match = futex_top_waiter(hb, key);
- if (match)
- return attach_to_pi_state(uval, match->pi_state, ps);
+ top_waiter = futex_top_waiter(hb, key);
+ if (top_waiter)
+ return attach_to_pi_state(uval, top_waiter->pi_state, ps);
/*
* No waiter and user TID is 0. We are here because the
@@ -1292,11 +1292,11 @@ static void mark_wake_futex(struct wake_
q->lock_ptr = NULL;
}
-static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_q *this,
+static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_q *top_waiter,
struct futex_hash_bucket *hb)
{
struct task_struct *new_owner;
- struct futex_pi_state *pi_state = this->pi_state;
+ struct futex_pi_state *pi_state = top_waiter->pi_state;
u32 uninitialized_var(curval), newval;
WAKE_Q(wake_q);
bool deboost;
@@ -1317,11 +1317,11 @@ static int wake_futex_pi(u32 __user *uad
/*
* It is possible that the next waiter (the one that brought
- * this owner to the kernel) timed out and is no longer
+ * top_waiter owner to the kernel) timed out and is no longer
* waiting on the lock.
*/
if (!new_owner)
- new_owner = this->task;
+ new_owner = top_waiter->task;
/*
* We pass it to the next owner. The WAITERS bit is always
@@ -2631,7 +2631,7 @@ static int futex_unlock_pi(u32 __user *u
u32 uninitialized_var(curval), uval, vpid = task_pid_vnr(current);
union futex_key key = FUTEX_KEY_INIT;
struct futex_hash_bucket *hb;
- struct futex_q *match;
+ struct futex_q *top_waiter;
int ret;
retry:
@@ -2655,9 +2655,9 @@ static int futex_unlock_pi(u32 __user *u
* all and we at least want to know if user space fiddled
* with the futex value instead of blindly unlocking.
*/
- match = futex_top_waiter(hb, &key);
- if (match) {
- ret = wake_futex_pi(uaddr, uval, match, hb);
+ top_waiter = futex_top_waiter(hb, &key);
+ if (top_waiter) {
+ ret = wake_futex_pi(uaddr, uval, top_waiter, hb);
/*
* In case of success wake_futex_pi dropped the hash
* bucket lock.
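
For reference, futex_top_waiter() simply walks the priority-sorted chain of
the hash bucket and returns the first waiter queued on the given key; in 4.9
it looks essentially like this:

static struct futex_q *futex_top_waiter(struct futex_hash_bucket *hb,
					union futex_key *key)
{
	struct futex_q *this;

	/* hb->chain is a plist, so iteration is in priority order */
	plist_for_each_entry(this, &hb->chain, list) {
		if (match_futex(&this->key, key))
			return this;
	}
	return NULL;
}

Since hb->chain is priority-ordered, the first match is the highest-priority
waiter on that futex, which is what makes 'top_waiter' the accurate name.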


@ -0,0 +1,39 @@
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 22 Mar 2017 11:35:49 +0100
Subject: [PATCH] futex: Use smp_store_release() in mark_wake_futex()
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Upstream commit 1b367ece0d7e696cab1c8501bab282cc6a538b3f
Since the futex_q can disappear the instruction after assigning NULL,
this really should be a RELEASE barrier. That stops loads from hitting
dead memory too.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: juri.lelli@arm.com
Cc: bigeasy@linutronix.de
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170322104151.604296452@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/futex.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1288,8 +1288,7 @@ static void mark_wake_futex(struct wake_
* memory barrier is required here to prevent the following
* store to lock_ptr from getting ahead of the plist_del.
*/
- smp_wmb();
- q->lock_ptr = NULL;
+ smp_store_release(&q->lock_ptr, NULL);
}
static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_q *top_waiter,
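
The subtlety: smp_wmb() only orders the earlier stores (the plist_del())
against the later store to lock_ptr; it says nothing about loads.
smp_store_release() additionally keeps earlier loads from being reordered
past the store that publishes the futex_q as dead. A minimal sketch of the
release/acquire pairing this relies on (illustrative, not the kernel code
itself):

/* Waker: all accesses to *q must complete before the waiter can free it. */
plist_del(&q->list, &hb->chain);
smp_store_release(&q->lock_ptr, NULL);	/* no prior load or store may be
					   reordered past this store */

/* Waiter side, conceptually: pairs with the release above. */
if (smp_load_acquire(&q->lock_ptr) == NULL) {
	/* every waker-side access before the release is now visible;
	   the futex_q (on the waiter's stack) may be reused */
}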


@ -0,0 +1,185 @@
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 22 Mar 2017 11:35:50 +0100
Subject: [PATCH] futex: Remove rt_mutex_deadlock_account_*()
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Upstream commit fffa954fb528963c2fb7b0c0084eb77e2be7ab52
These are unused and clutter up the code.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: juri.lelli@arm.com
Cc: bigeasy@linutronix.de
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170322104151.652692478@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/locking/rtmutex-debug.c | 9 -------
kernel/locking/rtmutex-debug.h | 3 --
kernel/locking/rtmutex.c | 47 +++++++++++++++--------------------------
kernel/locking/rtmutex.h | 2 -
4 files changed, 18 insertions(+), 43 deletions(-)
--- a/kernel/locking/rtmutex-debug.c
+++ b/kernel/locking/rtmutex-debug.c
@@ -173,12 +173,3 @@ void debug_rt_mutex_init(struct rt_mutex
lock->name = name;
}
-void
-rt_mutex_deadlock_account_lock(struct rt_mutex *lock, struct task_struct *task)
-{
-}
-
-void rt_mutex_deadlock_account_unlock(struct task_struct *task)
-{
-}
-
--- a/kernel/locking/rtmutex-debug.h
+++ b/kernel/locking/rtmutex-debug.h
@@ -9,9 +9,6 @@
* This file contains macros used solely by rtmutex.c. Debug version.
*/
-extern void
-rt_mutex_deadlock_account_lock(struct rt_mutex *lock, struct task_struct *task);
-extern void rt_mutex_deadlock_account_unlock(struct task_struct *task);
extern void debug_rt_mutex_init_waiter(struct rt_mutex_waiter *waiter);
extern void debug_rt_mutex_free_waiter(struct rt_mutex_waiter *waiter);
extern void debug_rt_mutex_init(struct rt_mutex *lock, const char *name);
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -936,8 +936,6 @@ static int try_to_take_rt_mutex(struct r
*/
rt_mutex_set_owner(lock, task);
- rt_mutex_deadlock_account_lock(lock, task);
-
return 1;
}
@@ -1340,8 +1338,6 @@ static bool __sched rt_mutex_slowunlock(
debug_rt_mutex_unlock(lock);
- rt_mutex_deadlock_account_unlock(current);
-
/*
* We must be careful here if the fast path is enabled. If we
* have no waiters queued we cannot set owner to NULL here
@@ -1407,11 +1403,10 @@ rt_mutex_fastlock(struct rt_mutex *lock,
struct hrtimer_sleeper *timeout,
enum rtmutex_chainwalk chwalk))
{
- if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current))) {
- rt_mutex_deadlock_account_lock(lock, current);
+ if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
return 0;
- } else
- return slowfn(lock, state, NULL, RT_MUTEX_MIN_CHAINWALK);
+
+ return slowfn(lock, state, NULL, RT_MUTEX_MIN_CHAINWALK);
}
static inline int
@@ -1423,21 +1418,19 @@ rt_mutex_timed_fastlock(struct rt_mutex
enum rtmutex_chainwalk chwalk))
{
if (chwalk == RT_MUTEX_MIN_CHAINWALK &&
- likely(rt_mutex_cmpxchg_acquire(lock, NULL, current))) {
- rt_mutex_deadlock_account_lock(lock, current);
+ likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
return 0;
- } else
- return slowfn(lock, state, timeout, chwalk);
+
+ return slowfn(lock, state, timeout, chwalk);
}
static inline int
rt_mutex_fasttrylock(struct rt_mutex *lock,
int (*slowfn)(struct rt_mutex *lock))
{
- if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current))) {
- rt_mutex_deadlock_account_lock(lock, current);
+ if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
return 1;
- }
+
return slowfn(lock);
}
@@ -1447,19 +1440,18 @@ rt_mutex_fastunlock(struct rt_mutex *loc
struct wake_q_head *wqh))
{
WAKE_Q(wake_q);
+ bool deboost;
- if (likely(rt_mutex_cmpxchg_release(lock, current, NULL))) {
- rt_mutex_deadlock_account_unlock(current);
+ if (likely(rt_mutex_cmpxchg_release(lock, current, NULL)))
+ return;
- } else {
- bool deboost = slowfn(lock, &wake_q);
+ deboost = slowfn(lock, &wake_q);
- wake_up_q(&wake_q);
+ wake_up_q(&wake_q);
- /* Undo pi boosting if necessary: */
- if (deboost)
- rt_mutex_adjust_prio(current);
- }
+ /* Undo pi boosting if necessary: */
+ if (deboost)
+ rt_mutex_adjust_prio(current);
}
/**
@@ -1570,10 +1562,9 @@ EXPORT_SYMBOL_GPL(rt_mutex_unlock);
bool __sched rt_mutex_futex_unlock(struct rt_mutex *lock,
struct wake_q_head *wqh)
{
- if (likely(rt_mutex_cmpxchg_release(lock, current, NULL))) {
- rt_mutex_deadlock_account_unlock(current);
+ if (likely(rt_mutex_cmpxchg_release(lock, current, NULL)))
return false;
- }
+
return rt_mutex_slowunlock(lock, wqh);
}
@@ -1631,7 +1622,6 @@ void rt_mutex_init_proxy_locked(struct r
__rt_mutex_init(lock, NULL);
debug_rt_mutex_proxy_lock(lock, proxy_owner);
rt_mutex_set_owner(lock, proxy_owner);
- rt_mutex_deadlock_account_lock(lock, proxy_owner);
}
/**
@@ -1647,7 +1637,6 @@ void rt_mutex_proxy_unlock(struct rt_mut
{
debug_rt_mutex_proxy_unlock(lock);
rt_mutex_set_owner(lock, NULL);
- rt_mutex_deadlock_account_unlock(proxy_owner);
}
/**
--- a/kernel/locking/rtmutex.h
+++ b/kernel/locking/rtmutex.h
@@ -11,8 +11,6 @@
*/
#define rt_mutex_deadlock_check(l) (0)
-#define rt_mutex_deadlock_account_lock(m, t) do { } while (0)
-#define rt_mutex_deadlock_account_unlock(l) do { } while (0)
#define debug_rt_mutex_init_waiter(w) do { } while (0)
#define debug_rt_mutex_free_waiter(w) do { } while (0)
#define debug_rt_mutex_lock(l) do { } while (0)


@ -0,0 +1,221 @@
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 22 Mar 2017 11:35:51 +0100
Subject: [PATCH] futex,rt_mutex: Provide futex specific rt_mutex API
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Upstream commit 5293c2efda37775346885c7e924d4ef7018ea60b
Part of what makes futex_unlock_pi() intricate is that
rt_mutex_futex_unlock() -> rt_mutex_slowunlock() can drop
rt_mutex::wait_lock.
This means it cannot rely on the atomicity of wait_lock, which would be
preferred in order to not rely on hb->lock so much.
The reason rt_mutex_slowunlock() needs to drop wait_lock is that it can
race with the rt_mutex fast path; futexes, however, have their own fast path.
Since futexes already have a bunch of separate rt_mutex accessors, complete
that set and implement an rt_mutex variant without a fast path for them.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: juri.lelli@arm.com
Cc: bigeasy@linutronix.de
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170322104151.702962446@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/futex.c | 30 ++++++++++-----------
kernel/locking/rtmutex.c | 55 +++++++++++++++++++++++++++++-----------
kernel/locking/rtmutex_common.h | 9 +++++-
3 files changed, 62 insertions(+), 32 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -914,7 +914,7 @@ void exit_pi_state_list(struct task_stru
pi_state->owner = NULL;
raw_spin_unlock_irq(&curr->pi_lock);
- rt_mutex_unlock(&pi_state->pi_mutex);
+ rt_mutex_futex_unlock(&pi_state->pi_mutex);
spin_unlock(&hb->lock);
@@ -1362,20 +1362,18 @@ static int wake_futex_pi(u32 __user *uad
pi_state->owner = new_owner;
raw_spin_unlock(&new_owner->pi_lock);
- raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
-
- deboost = rt_mutex_futex_unlock(&pi_state->pi_mutex, &wake_q);
-
/*
- * First unlock HB so the waiter does not spin on it once he got woken
- * up. Second wake up the waiter before the priority is adjusted. If we
- * deboost first (and lose our higher priority), then the task might get
- * scheduled away before the wake up can take place.
+ * We've updated the uservalue, this unlock cannot fail.
*/
+ deboost = __rt_mutex_futex_unlock(&pi_state->pi_mutex, &wake_q);
+
+ raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
spin_unlock(&hb->lock);
- wake_up_q(&wake_q);
- if (deboost)
+
+ if (deboost) {
+ wake_up_q(&wake_q);
rt_mutex_adjust_prio(current);
+ }
return 0;
}
@@ -2251,7 +2249,7 @@ static int fixup_owner(u32 __user *uaddr
* task acquired the rt_mutex after we removed ourself from the
* rt_mutex waiters list.
*/
- if (rt_mutex_trylock(&q->pi_state->pi_mutex)) {
+ if (rt_mutex_futex_trylock(&q->pi_state->pi_mutex)) {
locked = 1;
goto out;
}
@@ -2566,7 +2564,7 @@ static int futex_lock_pi(u32 __user *uad
if (!trylock) {
ret = rt_mutex_timed_futex_lock(&q.pi_state->pi_mutex, to);
} else {
- ret = rt_mutex_trylock(&q.pi_state->pi_mutex);
+ ret = rt_mutex_futex_trylock(&q.pi_state->pi_mutex);
/* Fixup the trylock return value: */
ret = ret ? 0 : -EWOULDBLOCK;
}
@@ -2589,7 +2587,7 @@ static int futex_lock_pi(u32 __user *uad
* it and return the fault to userspace.
*/
if (ret && (rt_mutex_owner(&q.pi_state->pi_mutex) == current))
- rt_mutex_unlock(&q.pi_state->pi_mutex);
+ rt_mutex_futex_unlock(&q.pi_state->pi_mutex);
/* Unqueue and drop the lock */
unqueue_me_pi(&q);
@@ -2896,7 +2894,7 @@ static int futex_wait_requeue_pi(u32 __u
spin_lock(q.lock_ptr);
ret = fixup_pi_state_owner(uaddr2, &q, current);
if (ret && rt_mutex_owner(&q.pi_state->pi_mutex) == current)
- rt_mutex_unlock(&q.pi_state->pi_mutex);
+ rt_mutex_futex_unlock(&q.pi_state->pi_mutex);
/*
* Drop the reference to the pi state which
* the requeue_pi() code acquired for us.
@@ -2936,7 +2934,7 @@ static int futex_wait_requeue_pi(u32 __u
* userspace.
*/
if (ret && rt_mutex_owner(pi_mutex) == current)
- rt_mutex_unlock(pi_mutex);
+ rt_mutex_futex_unlock(pi_mutex);
/* Unqueue and drop the lock. */
unqueue_me_pi(&q);
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1486,15 +1486,23 @@ EXPORT_SYMBOL_GPL(rt_mutex_lock_interrup
/*
* Futex variant with full deadlock detection.
+ * Futex variants must not use the fast-path, see __rt_mutex_futex_unlock().
*/
-int rt_mutex_timed_futex_lock(struct rt_mutex *lock,
+int __sched rt_mutex_timed_futex_lock(struct rt_mutex *lock,
struct hrtimer_sleeper *timeout)
{
might_sleep();
- return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
- RT_MUTEX_FULL_CHAINWALK,
- rt_mutex_slowlock);
+ return rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE,
+ timeout, RT_MUTEX_FULL_CHAINWALK);
+}
+
+/*
+ * Futex variant, must not use fastpath.
+ */
+int __sched rt_mutex_futex_trylock(struct rt_mutex *lock)
+{
+ return rt_mutex_slowtrylock(lock);
}
/**
@@ -1553,19 +1561,38 @@ void __sched rt_mutex_unlock(struct rt_m
EXPORT_SYMBOL_GPL(rt_mutex_unlock);
/**
- * rt_mutex_futex_unlock - Futex variant of rt_mutex_unlock
- * @lock: the rt_mutex to be unlocked
- *
- * Returns: true/false indicating whether priority adjustment is
- * required or not.
+ * Futex variant, that since futex variants do not use the fast-path, can be
+ * simple and will not need to retry.
*/
-bool __sched rt_mutex_futex_unlock(struct rt_mutex *lock,
- struct wake_q_head *wqh)
+bool __sched __rt_mutex_futex_unlock(struct rt_mutex *lock,
+ struct wake_q_head *wake_q)
+{
+ lockdep_assert_held(&lock->wait_lock);
+
+ debug_rt_mutex_unlock(lock);
+
+ if (!rt_mutex_has_waiters(lock)) {
+ lock->owner = NULL;
+ return false; /* done */
+ }
+
+ mark_wakeup_next_waiter(wake_q, lock);
+ return true; /* deboost and wakeups */
+}
+
+void __sched rt_mutex_futex_unlock(struct rt_mutex *lock)
{
- if (likely(rt_mutex_cmpxchg_release(lock, current, NULL)))
- return false;
+ WAKE_Q(wake_q);
+ bool deboost;
- return rt_mutex_slowunlock(lock, wqh);
+ raw_spin_lock_irq(&lock->wait_lock);
+ deboost = __rt_mutex_futex_unlock(lock, &wake_q);
+ raw_spin_unlock_irq(&lock->wait_lock);
+
+ if (deboost) {
+ wake_up_q(&wake_q);
+ rt_mutex_adjust_prio(current);
+ }
}
/**
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -109,9 +109,14 @@ extern int rt_mutex_start_proxy_lock(str
extern int rt_mutex_finish_proxy_lock(struct rt_mutex *lock,
struct hrtimer_sleeper *to,
struct rt_mutex_waiter *waiter);
+
extern int rt_mutex_timed_futex_lock(struct rt_mutex *l, struct hrtimer_sleeper *to);
-extern bool rt_mutex_futex_unlock(struct rt_mutex *lock,
- struct wake_q_head *wqh);
+extern int rt_mutex_futex_trylock(struct rt_mutex *l);
+
+extern void rt_mutex_futex_unlock(struct rt_mutex *lock);
+extern bool __rt_mutex_futex_unlock(struct rt_mutex *lock,
+ struct wake_q_head *wqh);
+
extern void rt_mutex_adjust_prio(struct task_struct *task);
#ifdef CONFIG_DEBUG_RT_MUTEXES
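
With the set complete, the futex code has a fast-path-free API to pair up; a
sketch of the usage, abridged from the futex_lock_pi()/futex_unlock_pi()
hunks above:

/* acquire side, as in futex_lock_pi(): opportunistic or blocking */
if (trylock)
	ret = rt_mutex_futex_trylock(&q.pi_state->pi_mutex) ? 0 : -EWOULDBLOCK;
else
	ret = rt_mutex_timed_futex_lock(&q.pi_state->pi_mutex, to);

/* release side: never the cmpxchg fast path, so rt_mutex::owner only
 * ever changes with rt_mutex::wait_lock held */
rt_mutex_futex_unlock(&q.pi_state->pi_mutex);

Because neither side ever takes the cmpxchg fast path, a holder of wait_lock
observes a stable {owner, waiters} state, which is exactly the atomicity the
changelog asks for.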


@ -0,0 +1,371 @@
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 22 Mar 2017 11:35:52 +0100
Subject: [PATCH] futex: Change locking rules
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Upstream commit 734009e96d1983ad739e5b656e03430b3660c913
Currently futex-pi relies on hb->lock to serialize everything. But hb->lock
creates another set of problems, especially priority inversions on RT where
hb->lock becomes a rt_mutex itself.
The rt_mutex::wait_lock is the most obvious protection for keeping the
futex user space value and the kernel internal pi_state in sync.
Rework and document the locking so rt_mutex::wait_lock is held across all
operations which modify the user space value and the pi state.
This allows, as a next step, invoking rt_mutex_unlock() (including the
deboost) without holding hb->lock.
Nothing yet relies on the new locking rules.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: juri.lelli@arm.com
Cc: bigeasy@linutronix.de
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170322104151.751993333@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/futex.c | 165 +++++++++++++++++++++++++++++++++++++++++++++------------
1 file changed, 132 insertions(+), 33 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -971,6 +971,39 @@ void exit_pi_state_list(struct task_stru
*
* [10] There is no transient state which leaves owner and user space
* TID out of sync.
+ *
+ *
+ * Serialization and lifetime rules:
+ *
+ * hb->lock:
+ *
+ * hb -> futex_q, relation
+ * futex_q -> pi_state, relation
+ *
+ * (cannot be raw because hb can contain arbitrary amount
+ * of futex_q's)
+ *
+ * pi_mutex->wait_lock:
+ *
+ * {uval, pi_state}
+ *
+ * (and pi_mutex 'obviously')
+ *
+ * p->pi_lock:
+ *
+ * p->pi_state_list -> pi_state->list, relation
+ *
+ * pi_state->refcount:
+ *
+ * pi_state lifetime
+ *
+ *
+ * Lock order:
+ *
+ * hb->lock
+ * pi_mutex->wait_lock
+ * p->pi_lock
+ *
*/
/*
@@ -978,10 +1011,12 @@ void exit_pi_state_list(struct task_stru
* the pi_state against the user space value. If correct, attach to
* it.
*/
-static int attach_to_pi_state(u32 uval, struct futex_pi_state *pi_state,
+static int attach_to_pi_state(u32 __user *uaddr, u32 uval,
+ struct futex_pi_state *pi_state,
struct futex_pi_state **ps)
{
pid_t pid = uval & FUTEX_TID_MASK;
+ int ret, uval2;
/*
* Userspace might have messed up non-PI and PI futexes [3]
@@ -989,9 +1024,34 @@ static int attach_to_pi_state(u32 uval,
if (unlikely(!pi_state))
return -EINVAL;
+ /*
+ * We get here with hb->lock held, and having found a
+ * futex_top_waiter(). This means that futex_lock_pi() of said futex_q
+ * has dropped the hb->lock in between queue_me() and unqueue_me_pi(),
+ * which in turn means that futex_lock_pi() still has a reference on
+ * our pi_state.
+ */
WARN_ON(!atomic_read(&pi_state->refcount));
/*
+ * Now that we have a pi_state, we can acquire wait_lock
+ * and do the state validation.
+ */
+ raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
+
+ /*
+ * Since {uval, pi_state} is serialized by wait_lock, and our current
+ * uval was read without holding it, it can have changed. Verify it
+ * still is what we expect it to be, otherwise retry the entire
+ * operation.
+ */
+ if (get_futex_value_locked(&uval2, uaddr))
+ goto out_efault;
+
+ if (uval != uval2)
+ goto out_eagain;
+
+ /*
* Handle the owner died case:
*/
if (uval & FUTEX_OWNER_DIED) {
@@ -1006,11 +1066,11 @@ static int attach_to_pi_state(u32 uval,
* is not 0. Inconsistent state. [5]
*/
if (pid)
- return -EINVAL;
+ goto out_einval;
/*
* Take a ref on the state and return success. [4]
*/
- goto out_state;
+ goto out_attach;
}
/*
@@ -1022,14 +1082,14 @@ static int attach_to_pi_state(u32 uval,
* Take a ref on the state and return success. [6]
*/
if (!pid)
- goto out_state;
+ goto out_attach;
} else {
/*
* If the owner died bit is not set, then the pi_state
* must have an owner. [7]
*/
if (!pi_state->owner)
- return -EINVAL;
+ goto out_einval;
}
/*
@@ -1038,11 +1098,29 @@ static int attach_to_pi_state(u32 uval,
* user space TID. [9/10]
*/
if (pid != task_pid_vnr(pi_state->owner))
- return -EINVAL;
-out_state:
+ goto out_einval;
+
+out_attach:
atomic_inc(&pi_state->refcount);
+ raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
*ps = pi_state;
return 0;
+
+out_einval:
+ ret = -EINVAL;
+ goto out_error;
+
+out_eagain:
+ ret = -EAGAIN;
+ goto out_error;
+
+out_efault:
+ ret = -EFAULT;
+ goto out_error;
+
+out_error:
+ raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
+ return ret;
}
/*
@@ -1093,6 +1171,9 @@ static int attach_to_pi_owner(u32 uval,
/*
* No existing pi state. First waiter. [2]
+ *
+ * This creates pi_state, we have hb->lock held, this means nothing can
+ * observe this state, wait_lock is irrelevant.
*/
pi_state = alloc_pi_state();
@@ -1117,7 +1198,8 @@ static int attach_to_pi_owner(u32 uval,
return 0;
}
-static int lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
+static int lookup_pi_state(u32 __user *uaddr, u32 uval,
+ struct futex_hash_bucket *hb,
union futex_key *key, struct futex_pi_state **ps)
{
struct futex_q *top_waiter = futex_top_waiter(hb, key);
@@ -1127,7 +1209,7 @@ static int lookup_pi_state(u32 uval, str
* attach to the pi_state when the validation succeeds.
*/
if (top_waiter)
- return attach_to_pi_state(uval, top_waiter->pi_state, ps);
+ return attach_to_pi_state(uaddr, uval, top_waiter->pi_state, ps);
/*
* We are the first waiter - try to look up the owner based on
@@ -1146,7 +1228,7 @@ static int lock_pi_update_atomic(u32 __u
if (unlikely(cmpxchg_futex_value_locked(&curval, uaddr, uval, newval)))
return -EFAULT;
- /*If user space value changed, let the caller retry */
+ /* If user space value changed, let the caller retry */
return curval != uval ? -EAGAIN : 0;
}
@@ -1202,7 +1284,7 @@ static int futex_lock_pi_atomic(u32 __us
*/
top_waiter = futex_top_waiter(hb, key);
if (top_waiter)
- return attach_to_pi_state(uval, top_waiter->pi_state, ps);
+ return attach_to_pi_state(uaddr, uval, top_waiter->pi_state, ps);
/*
* No waiter and user TID is 0. We are here because the
@@ -1334,6 +1416,7 @@ static int wake_futex_pi(u32 __user *uad
if (cmpxchg_futex_value_locked(&curval, uaddr, uval, newval)) {
ret = -EFAULT;
+
} else if (curval != uval) {
/*
* If a unconditional UNLOCK_PI operation (user space did not
@@ -1346,6 +1429,7 @@ static int wake_futex_pi(u32 __user *uad
else
ret = -EINVAL;
}
+
if (ret) {
raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
return ret;
@@ -1821,7 +1905,7 @@ static int futex_requeue(u32 __user *uad
* If that call succeeds then we have pi_state and an
* initial refcount on it.
*/
- ret = lookup_pi_state(ret, hb2, &key2, &pi_state);
+ ret = lookup_pi_state(uaddr2, ret, hb2, &key2, &pi_state);
}
switch (ret) {
@@ -2120,10 +2204,13 @@ static int fixup_pi_state_owner(u32 __us
{
u32 newtid = task_pid_vnr(newowner) | FUTEX_WAITERS;
struct futex_pi_state *pi_state = q->pi_state;
- struct task_struct *oldowner = pi_state->owner;
u32 uval, uninitialized_var(curval), newval;
+ struct task_struct *oldowner;
int ret;
+ raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
+
+ oldowner = pi_state->owner;
/* Owner died? */
if (!pi_state->owner)
newtid |= FUTEX_OWNER_DIED;
@@ -2139,11 +2226,10 @@ static int fixup_pi_state_owner(u32 __us
* because we can fault here. Imagine swapped out pages or a fork
* that marked all the anonymous memory readonly for cow.
*
- * Modifying pi_state _before_ the user space value would
- * leave the pi_state in an inconsistent state when we fault
- * here, because we need to drop the hash bucket lock to
- * handle the fault. This might be observed in the PID check
- * in lookup_pi_state.
+ * Modifying pi_state _before_ the user space value would leave the
+ * pi_state in an inconsistent state when we fault here, because we
+ * need to drop the locks to handle the fault. This might be observed
+ * in the PID check in lookup_pi_state.
*/
retry:
if (get_futex_value_locked(&uval, uaddr))
@@ -2164,47 +2250,60 @@ static int fixup_pi_state_owner(u32 __us
* itself.
*/
if (pi_state->owner != NULL) {
- raw_spin_lock_irq(&pi_state->owner->pi_lock);
+ raw_spin_lock(&pi_state->owner->pi_lock);
WARN_ON(list_empty(&pi_state->list));
list_del_init(&pi_state->list);
- raw_spin_unlock_irq(&pi_state->owner->pi_lock);
+ raw_spin_unlock(&pi_state->owner->pi_lock);
}
pi_state->owner = newowner;
- raw_spin_lock_irq(&newowner->pi_lock);
+ raw_spin_lock(&newowner->pi_lock);
WARN_ON(!list_empty(&pi_state->list));
list_add(&pi_state->list, &newowner->pi_state_list);
- raw_spin_unlock_irq(&newowner->pi_lock);
+ raw_spin_unlock(&newowner->pi_lock);
+ raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
+
return 0;
/*
- * To handle the page fault we need to drop the hash bucket
- * lock here. That gives the other task (either the highest priority
- * waiter itself or the task which stole the rtmutex) the
- * chance to try the fixup of the pi_state. So once we are
- * back from handling the fault we need to check the pi_state
- * after reacquiring the hash bucket lock and before trying to
- * do another fixup. When the fixup has been done already we
- * simply return.
+ * To handle the page fault we need to drop the locks here. That gives
+ * the other task (either the highest priority waiter itself or the
+ * task which stole the rtmutex) the chance to try the fixup of the
+ * pi_state. So once we are back from handling the fault we need to
+ * check the pi_state after reacquiring the locks and before trying to
+ * do another fixup. When the fixup has been done already we simply
+ * return.
+ *
+ * Note: we hold both hb->lock and pi_mutex->wait_lock. We can safely
+ * drop hb->lock since the caller owns the hb -> futex_q relation.
+ * Dropping the pi_mutex->wait_lock requires the state revalidate.
*/
handle_fault:
+ raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
spin_unlock(q->lock_ptr);
ret = fault_in_user_writeable(uaddr);
spin_lock(q->lock_ptr);
+ raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
/*
* Check if someone else fixed it for us:
*/
- if (pi_state->owner != oldowner)
- return 0;
+ if (pi_state->owner != oldowner) {
+ ret = 0;
+ goto out_unlock;
+ }
if (ret)
- return ret;
+ goto out_unlock;
goto retry;
+
+out_unlock:
+ raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
+ return ret;
}
static long futex_wait_restart(struct restart_block *restart);
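
In code, the documented order means every path that updates the user space
value together with the pi_state now nests like this (a sketch of the
convention, not a literal tree excerpt):

spin_lock(&hb->lock);					/* hb -> futex_q     */
raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);	/* {uval, pi_state}  */
raw_spin_lock(&task->pi_lock);				/* pi_state->list    */
/* ... modify uval / pi_state / pi_state_list ... */
raw_spin_unlock(&task->pi_lock);
raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
spin_unlock(&hb->lock);

Note the inner pi_lock uses the non-_irq variant, as in the
fixup_pi_state_owner() hunk above: interrupts are already disabled by the
wait_lock acquisition.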


@ -0,0 +1,76 @@
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 22 Mar 2017 11:35:53 +0100
Subject: [PATCH] futex: Cleanup refcounting
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Upstream commit bf92cf3a5100f5a0d5f9834787b130159397cb22
Add a put_pi_state() as counterpart for get_pi_state() so the refcounting
becomes consistent.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: juri.lelli@arm.com
Cc: bigeasy@linutronix.de
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170322104151.801778516@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/futex.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -800,7 +800,7 @@ static int refill_pi_state_cache(void)
return 0;
}
-static struct futex_pi_state * alloc_pi_state(void)
+static struct futex_pi_state *alloc_pi_state(void)
{
struct futex_pi_state *pi_state = current->pi_state_cache;
@@ -810,6 +810,11 @@ static struct futex_pi_state * alloc_pi_
return pi_state;
}
+static void get_pi_state(struct futex_pi_state *pi_state)
+{
+ WARN_ON_ONCE(!atomic_inc_not_zero(&pi_state->refcount));
+}
+
/*
* Drops a reference to the pi_state object and frees or caches it
* when the last reference is gone.
@@ -854,7 +859,7 @@ static void put_pi_state(struct futex_pi
* Look up the task based on what TID userspace gave us.
* We dont trust it.
*/
-static struct task_struct * futex_find_get_task(pid_t pid)
+static struct task_struct *futex_find_get_task(pid_t pid)
{
struct task_struct *p;
@@ -1101,7 +1106,7 @@ static int attach_to_pi_state(u32 __user
goto out_einval;
out_attach:
- atomic_inc(&pi_state->refcount);
+ get_pi_state(pi_state);
raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
*ps = pi_state;
return 0;
@@ -1988,7 +1993,7 @@ static int futex_requeue(u32 __user *uad
* refcount on the pi_state and store the pointer in
* the futex_q object of the waiter.
*/
- atomic_inc(&pi_state->refcount);
+ get_pi_state(pi_state);
this->pi_state = pi_state;
ret = rt_mutex_start_proxy_lock(&pi_state->pi_mutex,
this->rt_waiter,
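
For symmetry, the pre-existing put_pi_state() (not shown in these hunks)
drops the reference and disposes of the object on the final put; roughly:

static void put_pi_state(struct futex_pi_state *pi_state)
{
	if (!atomic_dec_and_test(&pi_state->refcount))
		return;
	/* last reference: detach from the owner's pi_state_list,
	 * then free or stash in current->pi_state_cache */
}

The WARN_ON_ONCE(!atomic_inc_not_zero(...)) in get_pi_state() encodes the
usage rule: a new reference may only be taken by someone who already holds
one, so the count can never legitimately be observed at zero there.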


@ -0,0 +1,140 @@
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 22 Mar 2017 11:35:54 +0100
Subject: [PATCH] futex: Rework inconsistent rt_mutex/futex_q state
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Upstream commit 73d786bd043ebc855f349c81ea805f6b11cbf2aa
There is a weird state in the futex_unlock_pi() path when it interleaves
with a concurrent futex_lock_pi() at the point where it drops hb->lock.
In this case, it can happen that the rt_mutex wait_list and the futex_q
disagree on pending waiters, in particular rt_mutex will find no pending
waiters where futex_q thinks there are. In this case the rt_mutex unlock
code cannot assign an owner.
The futex side fixup code has to clean up the inconsistencies with quite a
bunch of interesting corner cases.
Simplify all this by changing wake_futex_pi() to return -EAGAIN when this
situation occurs. This then gives the futex_lock_pi() code the opportunity
to continue and the retried futex_unlock_pi() will now observe a coherent
state.
The only problem is that this breaks RT timeliness guarantees. That
is, consider the following scenario:
T1 and T2 are both pinned to CPU0. prio(T2) > prio(T1)

    CPU0

    T1
      lock_pi()
      queue_me()  <- Waiter is visible

    preemption

    T2
      unlock_pi()
        loops with -EAGAIN forever
Which is undesirable for PI primitives. Future patches will rectify
this.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: juri.lelli@arm.com
Cc: bigeasy@linutronix.de
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170322104151.850383690@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/futex.c | 50 ++++++++++++++------------------------------------
1 file changed, 14 insertions(+), 36 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1402,12 +1402,19 @@ static int wake_futex_pi(u32 __user *uad
new_owner = rt_mutex_next_owner(&pi_state->pi_mutex);
/*
- * It is possible that the next waiter (the one that brought
- * top_waiter owner to the kernel) timed out and is no longer
- * waiting on the lock.
+ * When we interleave with futex_lock_pi() where it does
+ * rt_mutex_timed_futex_lock(), we might observe @this futex_q waiter,
+ * but the rt_mutex's wait_list can be empty (either still, or again,
+ * depending on which side we land).
+ *
+ * When this happens, give up our locks and try again, giving the
+ * futex_lock_pi() instance time to complete, either by waiting on the
+ * rtmutex or removing itself from the futex queue.
*/
- if (!new_owner)
- new_owner = top_waiter->task;
+ if (!new_owner) {
+ raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
+ return -EAGAIN;
+ }
/*
* We pass it to the next owner. The WAITERS bit is always
@@ -2330,7 +2337,6 @@ static long futex_wait_restart(struct re
*/
static int fixup_owner(u32 __user *uaddr, struct futex_q *q, int locked)
{
- struct task_struct *owner;
int ret = 0;
if (locked) {
@@ -2344,43 +2350,15 @@ static int fixup_owner(u32 __user *uaddr
}
/*
- * Catch the rare case, where the lock was released when we were on the
- * way back before we locked the hash bucket.
- */
- if (q->pi_state->owner == current) {
- /*
- * Try to get the rt_mutex now. This might fail as some other
- * task acquired the rt_mutex after we removed ourself from the
- * rt_mutex waiters list.
- */
- if (rt_mutex_futex_trylock(&q->pi_state->pi_mutex)) {
- locked = 1;
- goto out;
- }
-
- /*
- * pi_state is incorrect, some other task did a lock steal and
- * we returned due to timeout or signal without taking the
- * rt_mutex. Too late.
- */
- raw_spin_lock_irq(&q->pi_state->pi_mutex.wait_lock);
- owner = rt_mutex_owner(&q->pi_state->pi_mutex);
- if (!owner)
- owner = rt_mutex_next_owner(&q->pi_state->pi_mutex);
- raw_spin_unlock_irq(&q->pi_state->pi_mutex.wait_lock);
- ret = fixup_pi_state_owner(uaddr, q, owner);
- goto out;
- }
-
- /*
* Paranoia check. If we did not take the lock, then we should not be
* the owner of the rt_mutex.
*/
- if (rt_mutex_owner(&q->pi_state->pi_mutex) == current)
+ if (rt_mutex_owner(&q->pi_state->pi_mutex) == current) {
printk(KERN_ERR "fixup_owner: ret = %d pi-mutex: %p "
"pi-state %p\n", ret,
q->pi_state->pi_mutex.owner,
q->pi_state->owner);
+ }
out:
return ret ? ret : locked;
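
The new -EAGAIN is absorbed by the existing retry logic in futex_unlock_pi();
schematically, the caller side at this point in the series looks like
(a sketch):

retry:
	/* ... revalidate uval, take hb->lock ... */
	top_waiter = futex_top_waiter(hb, &key);
	if (top_waiter) {
		ret = wake_futex_pi(uaddr, uval, top_waiter, hb);
		if (ret == -EAGAIN) {
			/* give the racing futex_lock_pi() time to either
			 * block on the rtmutex or dequeue itself */
			spin_unlock(&hb->lock);
			put_futex_key(&key);
			goto retry;
		}
	}

This is exactly the loop the changelog's T1/T2 scenario can spin in when the
preempted lock_pi() side never gets to run.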


@ -0,0 +1,358 @@
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 22 Mar 2017 11:35:55 +0100
Subject: [PATCH] futex: Pull rt_mutex_futex_unlock() out from under hb->lock
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Upstream commit 16ffa12d742534d4ff73e8b3a4e81c1de39196f0
There are a number of 'interesting' problems, all caused by holding
hb->lock while doing the rt_mutex_unlock() equivalent.
Notably:
- a PI inversion on hb->lock; and,
- a SCHED_DEADLINE crash because of pointer instability.
The previous changes:
- changed the locking rules to cover {uval,pi_state} with wait_lock;
- allowed doing rt_mutex_futex_unlock() without dropping wait_lock, which in
turn allows relying on wait_lock atomicity completely;
- simplified the waiter conundrum.
It's now sufficient to hold rtmutex::wait_lock and a reference on the
pi_state to protect the state consistency, so hb->lock can be dropped
before calling rt_mutex_futex_unlock().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: juri.lelli@arm.com
Cc: bigeasy@linutronix.de
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170322104151.900002056@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/futex.c | 154 +++++++++++++++++++++++++++++++++++++--------------------
1 file changed, 100 insertions(+), 54 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -919,10 +919,12 @@ void exit_pi_state_list(struct task_stru
pi_state->owner = NULL;
raw_spin_unlock_irq(&curr->pi_lock);
- rt_mutex_futex_unlock(&pi_state->pi_mutex);
-
+ get_pi_state(pi_state);
spin_unlock(&hb->lock);
+ rt_mutex_futex_unlock(&pi_state->pi_mutex);
+ put_pi_state(pi_state);
+
raw_spin_lock_irq(&curr->pi_lock);
}
raw_spin_unlock_irq(&curr->pi_lock);
@@ -1035,6 +1037,11 @@ static int attach_to_pi_state(u32 __user
* has dropped the hb->lock in between queue_me() and unqueue_me_pi(),
* which in turn means that futex_lock_pi() still has a reference on
* our pi_state.
+ *
+ * The waiter holding a reference on @pi_state also protects against
+ * the unlocked put_pi_state() in futex_unlock_pi(), futex_lock_pi()
+ * and futex_wait_requeue_pi() as it cannot go to 0 and consequently
+ * free pi_state before we can take a reference ourselves.
*/
WARN_ON(!atomic_read(&pi_state->refcount));
@@ -1378,48 +1385,40 @@ static void mark_wake_futex(struct wake_
smp_store_release(&q->lock_ptr, NULL);
}
-static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_q *top_waiter,
- struct futex_hash_bucket *hb)
+/*
+ * Caller must hold a reference on @pi_state.
+ */
+static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_pi_state *pi_state)
{
- struct task_struct *new_owner;
- struct futex_pi_state *pi_state = top_waiter->pi_state;
u32 uninitialized_var(curval), newval;
+ struct task_struct *new_owner;
+ bool deboost = false;
WAKE_Q(wake_q);
- bool deboost;
int ret = 0;
- if (!pi_state)
- return -EINVAL;
-
- /*
- * If current does not own the pi_state then the futex is
- * inconsistent and user space fiddled with the futex value.
- */
- if (pi_state->owner != current)
- return -EINVAL;
-
raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
new_owner = rt_mutex_next_owner(&pi_state->pi_mutex);
-
- /*
- * When we interleave with futex_lock_pi() where it does
- * rt_mutex_timed_futex_lock(), we might observe @this futex_q waiter,
- * but the rt_mutex's wait_list can be empty (either still, or again,
- * depending on which side we land).
- *
- * When this happens, give up our locks and try again, giving the
- * futex_lock_pi() instance time to complete, either by waiting on the
- * rtmutex or removing itself from the futex queue.
- */
if (!new_owner) {
- raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
- return -EAGAIN;
+ /*
+ * Since we held neither hb->lock nor wait_lock when coming
+ * into this function, we could have raced with futex_lock_pi()
+ * such that we might observe @this futex_q waiter, but the
+ * rt_mutex's wait_list can be empty (either still, or again,
+ * depending on which side we land).
+ *
+ * When this happens, give up our locks and try again, giving
+ * the futex_lock_pi() instance time to complete, either by
+ * waiting on the rtmutex or removing itself from the futex
+ * queue.
+ */
+ ret = -EAGAIN;
+ goto out_unlock;
}
/*
- * We pass it to the next owner. The WAITERS bit is always
- * kept enabled while there is PI state around. We cleanup the
- * owner died bit, because we are the owner.
+ * We pass it to the next owner. The WAITERS bit is always kept
+ * enabled while there is PI state around. We cleanup the owner
+ * died bit, because we are the owner.
*/
newval = FUTEX_WAITERS | task_pid_vnr(new_owner);
@@ -1442,10 +1441,8 @@ static int wake_futex_pi(u32 __user *uad
ret = -EINVAL;
}
- if (ret) {
- raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
- return ret;
- }
+ if (ret)
+ goto out_unlock;
raw_spin_lock(&pi_state->owner->pi_lock);
WARN_ON(list_empty(&pi_state->list));
@@ -1463,15 +1460,15 @@ static int wake_futex_pi(u32 __user *uad
*/
deboost = __rt_mutex_futex_unlock(&pi_state->pi_mutex, &wake_q);
+out_unlock:
raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
- spin_unlock(&hb->lock);
if (deboost) {
wake_up_q(&wake_q);
rt_mutex_adjust_prio(current);
}
- return 0;
+ return ret;
}
/*
@@ -2230,7 +2227,8 @@ static int fixup_pi_state_owner(u32 __us
/*
* We are here either because we stole the rtmutex from the
* previous highest priority waiter or we are the highest priority
- * waiter but failed to get the rtmutex the first time.
+ * waiter but have failed to get the rtmutex the first time.
+ *
* We have to replace the newowner TID in the user space variable.
* This must be atomic as we have to preserve the owner died bit here.
*
@@ -2247,7 +2245,7 @@ static int fixup_pi_state_owner(u32 __us
if (get_futex_value_locked(&uval, uaddr))
goto handle_fault;
- while (1) {
+ for (;;) {
newval = (uval & FUTEX_OWNER_DIED) | newtid;
if (cmpxchg_futex_value_locked(&curval, uaddr, uval, newval))
@@ -2343,6 +2341,10 @@ static int fixup_owner(u32 __user *uaddr
/*
* Got the lock. We might not be the anticipated owner if we
* did a lock-steal - fix up the PI-state in that case:
+ *
+ * We can safely read pi_state->owner without holding wait_lock
+ * because we now own the rt_mutex, only the owner will attempt
+ * to change it.
*/
if (q->pi_state->owner != current)
ret = fixup_pi_state_owner(uaddr, q, current);
@@ -2582,6 +2584,7 @@ static int futex_lock_pi(u32 __user *uad
ktime_t *time, int trylock)
{
struct hrtimer_sleeper timeout, *to = NULL;
+ struct futex_pi_state *pi_state = NULL;
struct futex_hash_bucket *hb;
struct futex_q q = futex_q_init;
int res, ret;
@@ -2668,12 +2671,19 @@ static int futex_lock_pi(u32 __user *uad
* If fixup_owner() faulted and was unable to handle the fault, unlock
* it and return the fault to userspace.
*/
- if (ret && (rt_mutex_owner(&q.pi_state->pi_mutex) == current))
- rt_mutex_futex_unlock(&q.pi_state->pi_mutex);
+ if (ret && (rt_mutex_owner(&q.pi_state->pi_mutex) == current)) {
+ pi_state = q.pi_state;
+ get_pi_state(pi_state);
+ }
/* Unqueue and drop the lock */
unqueue_me_pi(&q);
+ if (pi_state) {
+ rt_mutex_futex_unlock(&pi_state->pi_mutex);
+ put_pi_state(pi_state);
+ }
+
goto out_put_key;
out_unlock_put_key:
@@ -2736,10 +2746,36 @@ static int futex_unlock_pi(u32 __user *u
*/
top_waiter = futex_top_waiter(hb, &key);
if (top_waiter) {
- ret = wake_futex_pi(uaddr, uval, top_waiter, hb);
+ struct futex_pi_state *pi_state = top_waiter->pi_state;
+
+ ret = -EINVAL;
+ if (!pi_state)
+ goto out_unlock;
+
+ /*
+ * If current does not own the pi_state then the futex is
+ * inconsistent and user space fiddled with the futex value.
+ */
+ if (pi_state->owner != current)
+ goto out_unlock;
+
+ /*
+ * Grab a reference on the pi_state and drop hb->lock.
+ *
+ * The reference ensures pi_state lives, dropping the hb->lock
+ * is tricky.. wake_futex_pi() will take rt_mutex::wait_lock to
+ * close the races against futex_lock_pi(), but in case of
+ * _any_ fail we'll abort and retry the whole deal.
+ */
+ get_pi_state(pi_state);
+ spin_unlock(&hb->lock);
+
+ ret = wake_futex_pi(uaddr, uval, pi_state);
+
+ put_pi_state(pi_state);
+
/*
- * In case of success wake_futex_pi dropped the hash
- * bucket lock.
+ * Success, we're done! No tricky corner cases.
*/
if (!ret)
goto out_putkey;
@@ -2754,7 +2790,6 @@ static int futex_unlock_pi(u32 __user *u
* setting the FUTEX_WAITERS bit. Try again.
*/
if (ret == -EAGAIN) {
- spin_unlock(&hb->lock);
put_futex_key(&key);
goto retry;
}
@@ -2762,7 +2797,7 @@ static int futex_unlock_pi(u32 __user *u
* wake_futex_pi has detected invalid state. Tell user
* space.
*/
- goto out_unlock;
+ goto out_putkey;
}
/*
@@ -2772,8 +2807,10 @@ static int futex_unlock_pi(u32 __user *u
* preserve the WAITERS bit not the OWNER_DIED one. We are the
* owner.
*/
- if (cmpxchg_futex_value_locked(&curval, uaddr, uval, 0))
+ if (cmpxchg_futex_value_locked(&curval, uaddr, uval, 0)) {
+ spin_unlock(&hb->lock);
goto pi_faulted;
+ }
/*
* If uval has changed, let user space handle it.
@@ -2787,7 +2824,6 @@ static int futex_unlock_pi(u32 __user *u
return ret;
pi_faulted:
- spin_unlock(&hb->lock);
put_futex_key(&key);
ret = fault_in_user_writeable(uaddr);
@@ -2891,6 +2927,7 @@ static int futex_wait_requeue_pi(u32 __u
u32 __user *uaddr2)
{
struct hrtimer_sleeper timeout, *to = NULL;
+ struct futex_pi_state *pi_state = NULL;
struct rt_mutex_waiter rt_waiter;
struct futex_hash_bucket *hb;
union futex_key key2 = FUTEX_KEY_INIT;
@@ -2975,8 +3012,10 @@ static int futex_wait_requeue_pi(u32 __u
if (q.pi_state && (q.pi_state->owner != current)) {
spin_lock(q.lock_ptr);
ret = fixup_pi_state_owner(uaddr2, &q, current);
- if (ret && rt_mutex_owner(&q.pi_state->pi_mutex) == current)
- rt_mutex_futex_unlock(&q.pi_state->pi_mutex);
+ if (ret && rt_mutex_owner(&q.pi_state->pi_mutex) == current) {
+ pi_state = q.pi_state;
+ get_pi_state(pi_state);
+ }
/*
* Drop the reference to the pi state which
* the requeue_pi() code acquired for us.
@@ -3015,13 +3054,20 @@ static int futex_wait_requeue_pi(u32 __u
* the fault, unlock the rt_mutex and return the fault to
* userspace.
*/
- if (ret && rt_mutex_owner(pi_mutex) == current)
- rt_mutex_futex_unlock(pi_mutex);
+ if (ret && rt_mutex_owner(&q.pi_state->pi_mutex) == current) {
+ pi_state = q.pi_state;
+ get_pi_state(pi_state);
+ }
/* Unqueue and drop the lock. */
unqueue_me_pi(&q);
}
+ if (pi_state) {
+ rt_mutex_futex_unlock(&pi_state->pi_mutex);
+ put_pi_state(pi_state);
+ }
+
if (ret == -EINTR) {
/*
* We've already been requeued, but cannot restart by calling
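
Distilled, the pattern this patch applies at every unlock site is: pin the
pi_state, drop hb->lock, and only then unlock the rtmutex (a sketch):

get_pi_state(pi_state);		/* keeps pi_state alive ...          */
spin_unlock(&hb->lock);		/* ... even once hb->lock is gone    */

rt_mutex_futex_unlock(&pi_state->pi_mutex);	/* may wake and deboost */

put_pi_state(pi_state);		/* drop our pin */

wait_lock (taken inside rt_mutex_futex_unlock()) plus the reference are now
sufficient for state consistency, so there is no PI-inverting unlock under
hb->lock and no pointer instability for SCHED_DEADLINE.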


@ -0,0 +1,80 @@
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 22 Mar 2017 11:35:56 +0100
Subject: [PATCH] futex,rt_mutex: Introduce rt_mutex_init_waiter()
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Upstream commit 50809358dd7199aa7ce232f6877dd09ec30ef374
Since there are already two copies of this code, introduce a helper now
before adding a third one.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: juri.lelli@arm.com
Cc: bigeasy@linutronix.de
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170322104151.950039479@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/futex.c | 5 +----
kernel/locking/rtmutex.c | 12 +++++++++---
kernel/locking/rtmutex_common.h | 1 +
3 files changed, 11 insertions(+), 7 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2954,10 +2954,7 @@ static int futex_wait_requeue_pi(u32 __u
* The waiter is allocated on our stack, manipulated by the requeue
* code while we sleep on uaddr.
*/
- debug_rt_mutex_init_waiter(&rt_waiter);
- RB_CLEAR_NODE(&rt_waiter.pi_tree_entry);
- RB_CLEAR_NODE(&rt_waiter.tree_entry);
- rt_waiter.task = NULL;
+ rt_mutex_init_waiter(&rt_waiter);
ret = get_futex_key(uaddr2, flags & FLAGS_SHARED, &key2, VERIFY_WRITE);
if (unlikely(ret != 0))
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1151,6 +1151,14 @@ void rt_mutex_adjust_pi(struct task_stru
next_lock, NULL, task);
}
+void rt_mutex_init_waiter(struct rt_mutex_waiter *waiter)
+{
+ debug_rt_mutex_init_waiter(waiter);
+ RB_CLEAR_NODE(&waiter->pi_tree_entry);
+ RB_CLEAR_NODE(&waiter->tree_entry);
+ waiter->task = NULL;
+}
+
/**
* __rt_mutex_slowlock() - Perform the wait-wake-try-to-take loop
* @lock: the rt_mutex to take
@@ -1233,9 +1241,7 @@ rt_mutex_slowlock(struct rt_mutex *lock,
unsigned long flags;
int ret = 0;
- debug_rt_mutex_init_waiter(&waiter);
- RB_CLEAR_NODE(&waiter.pi_tree_entry);
- RB_CLEAR_NODE(&waiter.tree_entry);
+ rt_mutex_init_waiter(&waiter);
/*
* Technically we could use raw_spin_[un]lock_irq() here, but this can
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -103,6 +103,7 @@ extern void rt_mutex_init_proxy_locked(s
struct task_struct *proxy_owner);
extern void rt_mutex_proxy_unlock(struct rt_mutex *lock,
struct task_struct *proxy_owner);
+extern void rt_mutex_init_waiter(struct rt_mutex_waiter *waiter);
extern int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
struct rt_mutex_waiter *waiter,
struct task_struct *task);


@ -0,0 +1,159 @@
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 22 Mar 2017 11:35:57 +0100
Subject: [PATCH] futex,rt_mutex: Restructure rt_mutex_finish_proxy_lock()
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Upstream commit 38d589f2fd08f1296aea3ce62bebd185125c6d81
With the ultimate goal of keeping rt_mutex wait_list and futex_q waiters
consistent, it's necessary to split 'rt_mutex_futex_lock()' into finer
parts, such that only the actual blocking can be done without hb->lock
held.
Split rt_mutex_finish_proxy_lock() into two parts, one that does the
blocking and one that does remove_waiter() when the lock acquire failed.
When the rtmutex was acquired successfully the waiter can be removed in the
acquisition path safely, since there is no concurrency on the lock owner.
This means that, except for futex_lock_pi(), all wait_list modifications
are done with both hb->lock and wait_lock held.
[bigeasy@linutronix.de: fix for futex_requeue_pi_signal_restart]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: juri.lelli@arm.com
Cc: bigeasy@linutronix.de
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170322104152.001659630@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/futex.c | 7 +++--
kernel/locking/rtmutex.c | 52 ++++++++++++++++++++++++++++++++++------
kernel/locking/rtmutex_common.h | 8 +++---
3 files changed, 55 insertions(+), 12 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -3030,10 +3030,13 @@ static int futex_wait_requeue_pi(u32 __u
*/
WARN_ON(!q.pi_state);
pi_mutex = &q.pi_state->pi_mutex;
- ret = rt_mutex_finish_proxy_lock(pi_mutex, to, &rt_waiter);
- debug_rt_mutex_free_waiter(&rt_waiter);
+ ret = rt_mutex_wait_proxy_lock(pi_mutex, to, &rt_waiter);
spin_lock(q.lock_ptr);
+ if (ret && !rt_mutex_cleanup_proxy_lock(pi_mutex, &rt_waiter))
+ ret = 0;
+
+ debug_rt_mutex_free_waiter(&rt_waiter);
/*
* Fixup the pi_state owner and possibly acquire the lock if we
* haven't already.
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1743,21 +1743,23 @@ struct task_struct *rt_mutex_next_owner(
}
/**
- * rt_mutex_finish_proxy_lock() - Complete lock acquisition
+ * rt_mutex_wait_proxy_lock() - Wait for lock acquisition
* @lock: the rt_mutex we were woken on
* @to: the timeout, null if none. hrtimer should already have
* been started.
* @waiter: the pre-initialized rt_mutex_waiter
*
- * Complete the lock acquisition started our behalf by another thread.
+ * Wait for the the lock acquisition started on our behalf by
+ * rt_mutex_start_proxy_lock(). Upon failure, the caller must call
+ * rt_mutex_cleanup_proxy_lock().
*
* Returns:
* 0 - success
* <0 - error, one of -EINTR, -ETIMEDOUT
*
- * Special API call for PI-futex requeue support
+ * Special API call for PI-futex support
*/
-int rt_mutex_finish_proxy_lock(struct rt_mutex *lock,
+int rt_mutex_wait_proxy_lock(struct rt_mutex *lock,
struct hrtimer_sleeper *to,
struct rt_mutex_waiter *waiter)
{
@@ -1770,9 +1772,6 @@ int rt_mutex_finish_proxy_lock(struct rt
/* sleep on the mutex */
ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter);
- if (unlikely(ret))
- remove_waiter(lock, waiter);
-
/*
* try_to_take_rt_mutex() sets the waiter bit unconditionally. We might
* have to fix that up.
@@ -1783,3 +1782,42 @@ int rt_mutex_finish_proxy_lock(struct rt
return ret;
}
+
+/**
+ * rt_mutex_cleanup_proxy_lock() - Cleanup failed lock acquisition
+ * @lock: the rt_mutex we were woken on
+ * @waiter: the pre-initialized rt_mutex_waiter
+ *
+ * Attempt to clean up after a failed rt_mutex_wait_proxy_lock().
+ *
+ * Unless we acquired the lock; we're still enqueued on the wait-list and can
+ * in fact still be granted ownership until we're removed. Therefore we can
+ * find we are in fact the owner and must disregard the
+ * rt_mutex_wait_proxy_lock() failure.
+ *
+ * Returns:
+ * true - did the cleanup, we done.
+ * false - we acquired the lock after rt_mutex_wait_proxy_lock() returned,
+ * caller should disregards its return value.
+ *
+ * Special API call for PI-futex support
+ */
+bool rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock,
+ struct rt_mutex_waiter *waiter)
+{
+ bool cleanup = false;
+
+ raw_spin_lock_irq(&lock->wait_lock);
+ /*
+ * Unless we're the owner; we're still enqueued on the wait_list.
+ * So check if we became owner, if not, take us off the wait_list.
+ */
+ if (rt_mutex_owner(lock) != current) {
+ remove_waiter(lock, waiter);
+ fixup_rt_mutex_waiters(lock);
+ cleanup = true;
+ }
+ raw_spin_unlock_irq(&lock->wait_lock);
+
+ return cleanup;
+}
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -107,9 +107,11 @@ extern void rt_mutex_init_waiter(struct
extern int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
struct rt_mutex_waiter *waiter,
struct task_struct *task);
-extern int rt_mutex_finish_proxy_lock(struct rt_mutex *lock,
- struct hrtimer_sleeper *to,
- struct rt_mutex_waiter *waiter);
+extern int rt_mutex_wait_proxy_lock(struct rt_mutex *lock,
+ struct hrtimer_sleeper *to,
+ struct rt_mutex_waiter *waiter);
+extern bool rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock,
+ struct rt_mutex_waiter *waiter);
extern int rt_mutex_timed_futex_lock(struct rt_mutex *l, struct hrtimer_sleeper *to);
extern int rt_mutex_futex_trylock(struct rt_mutex *l);
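
The new calling convention, abridged from the futex_wait_requeue_pi() hunk
above:

/* block without hb->lock held */
ret = rt_mutex_wait_proxy_lock(pi_mutex, to, &rt_waiter);

spin_lock(q.lock_ptr);
/*
 * Even on -EINTR/-ETIMEDOUT we may have become owner in the meantime;
 * if cleanup finds we are no longer enqueued, we won the lock after
 * all and the error must be disregarded.
 */
if (ret && !rt_mutex_cleanup_proxy_lock(pi_mutex, &rt_waiter))
	ret = 0;

So all wait_list manipulation done by the cleanup happens with both hb->lock
(q.lock_ptr) and wait_lock held, as the changelog requires.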


@ -0,0 +1,267 @@
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 22 Mar 2017 11:35:58 +0100
Subject: [PATCH] futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Upstream commit cfafcd117da0216520568c195cb2f6cd1980c4bb
By changing futex_lock_pi() to use rt_mutex_*_proxy_lock() all wait_list
modifications are done under both hb->lock and wait_lock.
This closes the obvious interleave pattern between futex_lock_pi() and
futex_unlock_pi(), but not entirely so. See below:
Before:

futex_lock_pi()                 futex_unlock_pi()
  unlock hb->lock

                                  lock hb->lock
                                  unlock hb->lock

                                  lock rt_mutex->wait_lock
                                  unlock rt_mutex_wait_lock
                                    -EAGAIN

  lock rt_mutex->wait_lock
  list_add
  unlock rt_mutex->wait_lock

  schedule()

  lock rt_mutex->wait_lock
  list_del
  unlock rt_mutex->wait_lock

                                  <idem>
                                  -EAGAIN

  lock hb->lock

After:

futex_lock_pi()                 futex_unlock_pi()

  lock hb->lock
  lock rt_mutex->wait_lock
  list_add
  unlock rt_mutex->wait_lock
  unlock hb->lock

  schedule()
                                  lock hb->lock
                                  unlock hb->lock
  lock hb->lock
  lock rt_mutex->wait_lock
  list_del
  unlock rt_mutex->wait_lock

                                  lock rt_mutex->wait_lock
                                  unlock rt_mutex_wait_lock
                                    -EAGAIN

  unlock hb->lock
It does, however, solve the earlier starvation/live-lock scenario introduced
with the -EAGAIN: unlike the "before" scenario, where the -EAGAIN happens
while futex_unlock_pi() doesn't hold any locks, in the "after" scenario it
happens while futex_unlock_pi() actually holds a lock, and it is then
serialized on that lock.
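A condensed sketch (editorial; pieced together from the hunks below, with
trylock, timeout and error handling elided) of the reworked futex_lock_pi()
slow path:

	__queue_me(&q, hb);                     /* enqueue; hb->lock still held */
	rt_mutex_init_waiter(&rt_waiter);
	ret = rt_mutex_start_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter, current);

	if (!ret) {
		spin_unlock(q.lock_ptr);        /* drop hb->lock only now */
		ret = rt_mutex_wait_proxy_lock(&q.pi_state->pi_mutex, to, &rt_waiter);

		spin_lock(q.lock_ptr);          /* hb->lock first, then fix the lists */
		if (ret && !rt_mutex_cleanup_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter))
			ret = 0;                /* we became the owner after all */
	} else if (ret == 1) {
		ret = 0;                        /* proxy start already got the lock */
	}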
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: juri.lelli@arm.com
Cc: bigeasy@linutronix.de
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170322104152.062785528@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/futex.c | 77 ++++++++++++++++++++++++++++------------
kernel/locking/rtmutex.c | 26 +++----------
kernel/locking/rtmutex_common.h | 1
3 files changed, 62 insertions(+), 42 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2097,20 +2097,7 @@ queue_unlock(struct futex_hash_bucket *h
hb_waiters_dec(hb);
}
-/**
- * queue_me() - Enqueue the futex_q on the futex_hash_bucket
- * @q: The futex_q to enqueue
- * @hb: The destination hash bucket
- *
- * The hb->lock must be held by the caller, and is released here. A call to
- * queue_me() is typically paired with exactly one call to unqueue_me(). The
- * exceptions involve the PI related operations, which may use unqueue_me_pi()
- * or nothing if the unqueue is done as part of the wake process and the unqueue
- * state is implicit in the state of woken task (see futex_wait_requeue_pi() for
- * an example).
- */
-static inline void queue_me(struct futex_q *q, struct futex_hash_bucket *hb)
- __releases(&hb->lock)
+static inline void __queue_me(struct futex_q *q, struct futex_hash_bucket *hb)
{
int prio;
@@ -2127,6 +2114,24 @@ static inline void queue_me(struct futex
plist_node_init(&q->list, prio);
plist_add(&q->list, &hb->chain);
q->task = current;
+}
+
+/**
+ * queue_me() - Enqueue the futex_q on the futex_hash_bucket
+ * @q: The futex_q to enqueue
+ * @hb: The destination hash bucket
+ *
+ * The hb->lock must be held by the caller, and is released here. A call to
+ * queue_me() is typically paired with exactly one call to unqueue_me(). The
+ * exceptions involve the PI related operations, which may use unqueue_me_pi()
+ * or nothing if the unqueue is done as part of the wake process and the unqueue
+ * state is implicit in the state of woken task (see futex_wait_requeue_pi() for
+ * an example).
+ */
+static inline void queue_me(struct futex_q *q, struct futex_hash_bucket *hb)
+ __releases(&hb->lock)
+{
+ __queue_me(q, hb);
spin_unlock(&hb->lock);
}
@@ -2585,6 +2590,7 @@ static int futex_lock_pi(u32 __user *uad
{
struct hrtimer_sleeper timeout, *to = NULL;
struct futex_pi_state *pi_state = NULL;
+ struct rt_mutex_waiter rt_waiter;
struct futex_hash_bucket *hb;
struct futex_q q = futex_q_init;
int res, ret;
@@ -2637,25 +2643,52 @@ static int futex_lock_pi(u32 __user *uad
}
}
+ WARN_ON(!q.pi_state);
+
/*
* Only actually queue now that the atomic ops are done:
*/
- queue_me(&q, hb);
+ __queue_me(&q, hb);
- WARN_ON(!q.pi_state);
- /*
- * Block on the PI mutex:
- */
- if (!trylock) {
- ret = rt_mutex_timed_futex_lock(&q.pi_state->pi_mutex, to);
- } else {
+ if (trylock) {
ret = rt_mutex_futex_trylock(&q.pi_state->pi_mutex);
/* Fixup the trylock return value: */
ret = ret ? 0 : -EWOULDBLOCK;
+ goto no_block;
}
+ /*
+ * We must add ourselves to the rt_mutex waitlist while holding hb->lock
+ * such that the hb and rt_mutex wait lists match.
+ */
+ rt_mutex_init_waiter(&rt_waiter);
+ ret = rt_mutex_start_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter, current);
+ if (ret) {
+ if (ret == 1)
+ ret = 0;
+
+ goto no_block;
+ }
+
+ spin_unlock(q.lock_ptr);
+
+ if (unlikely(to))
+ hrtimer_start_expires(&to->timer, HRTIMER_MODE_ABS);
+
+ ret = rt_mutex_wait_proxy_lock(&q.pi_state->pi_mutex, to, &rt_waiter);
+
spin_lock(q.lock_ptr);
/*
+ * If we failed to acquire the lock (signal/timeout), we must
+ * first acquire the hb->lock before removing the lock from the
+ * rt_mutex waitqueue, such that we can keep the hb and rt_mutex
+ * wait lists consistent.
+ */
+ if (ret && !rt_mutex_cleanup_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter))
+ ret = 0;
+
+no_block:
+ /*
* Fixup the pi_state owner and possibly acquire the lock if we
* haven't already.
*/
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1491,19 +1491,6 @@ int __sched rt_mutex_lock_interruptible(
EXPORT_SYMBOL_GPL(rt_mutex_lock_interruptible);
/*
- * Futex variant with full deadlock detection.
- * Futex variants must not use the fast-path, see __rt_mutex_futex_unlock().
- */
-int __sched rt_mutex_timed_futex_lock(struct rt_mutex *lock,
- struct hrtimer_sleeper *timeout)
-{
- might_sleep();
-
- return rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE,
- timeout, RT_MUTEX_FULL_CHAINWALK);
-}
-
-/*
* Futex variant, must not use fastpath.
*/
int __sched rt_mutex_futex_trylock(struct rt_mutex *lock)
@@ -1772,12 +1759,6 @@ int rt_mutex_wait_proxy_lock(struct rt_m
/* sleep on the mutex */
ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter);
- /*
- * try_to_take_rt_mutex() sets the waiter bit unconditionally. We might
- * have to fix that up.
- */
- fixup_rt_mutex_waiters(lock);
-
raw_spin_unlock_irq(&lock->wait_lock);
return ret;
@@ -1817,6 +1798,13 @@ bool rt_mutex_cleanup_proxy_lock(struct
fixup_rt_mutex_waiters(lock);
cleanup = true;
}
+
+ /*
+ * try_to_take_rt_mutex() sets the waiter bit unconditionally. We might
+ * have to fix that up.
+ */
+ fixup_rt_mutex_waiters(lock);
+
raw_spin_unlock_irq(&lock->wait_lock);
return cleanup;
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -113,7 +113,6 @@ extern int rt_mutex_wait_proxy_lock(stru
extern bool rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock,
struct rt_mutex_waiter *waiter);
-extern int rt_mutex_timed_futex_lock(struct rt_mutex *l, struct hrtimer_sleeper *to);
extern int rt_mutex_futex_trylock(struct rt_mutex *l);
extern void rt_mutex_futex_unlock(struct rt_mutex *lock);

@@ -0,0 +1,81 @@
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 22 Mar 2017 11:35:59 +0100
Subject: [PATCH] futex: Futex_unlock_pi() determinism
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Upstream commit bebe5b514345f09be2c15e414d076b02ecb9cce8
The problem with returning -EAGAIN when the waiter state mismatches is that
it becomes very hard to prove a bounded execution time for the
operation. And seeing that this is an RT operation, this is somewhat
important.
While in practice, given the previous patch, it will be very unlikely to
ever really take more than one or two rounds, proving so becomes rather
hard.
However, now that modifying the wait_list is done while holding both hb->lock
and wait_lock, the scenario can be avoided entirely by acquiring wait_lock
while still holding hb->lock, doing a hand-over without leaving a hole.
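The hand-over pattern in isolation; a minimal userspace analogue using POSIX
threads (editorial sketch; hb_lock and wait_lock are stand-ins for the kernel
locks):

	#include <pthread.h>

	static pthread_mutex_t hb_lock   = PTHREAD_MUTEX_INITIALIZER;
	static pthread_mutex_t wait_lock = PTHREAD_MUTEX_INITIALIZER;

	static void unlock_path(void)
	{
		pthread_mutex_lock(&hb_lock);
		/* ... validate the waiter state under hb_lock ... */

		/*
		 * Take the inner lock before dropping the outer one, so there
		 * is no window in which we hold neither and the state observed
		 * above could change underneath us.
		 */
		pthread_mutex_lock(&wait_lock);
		pthread_mutex_unlock(&hb_lock);

		/* ... wake the top waiter under wait_lock ... */
		pthread_mutex_unlock(&wait_lock);
	}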
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: juri.lelli@arm.com
Cc: bigeasy@linutronix.de
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170322104152.112378812@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/futex.c | 24 +++++++++++-------------
1 file changed, 11 insertions(+), 13 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1396,15 +1396,10 @@ static int wake_futex_pi(u32 __user *uad
WAKE_Q(wake_q);
int ret = 0;
- raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
new_owner = rt_mutex_next_owner(&pi_state->pi_mutex);
- if (!new_owner) {
+ if (WARN_ON_ONCE(!new_owner)) {
/*
- * Since we held neither hb->lock nor wait_lock when coming
- * into this function, we could have raced with futex_lock_pi()
- * such that we might observe @this futex_q waiter, but the
- * rt_mutex's wait_list can be empty (either still, or again,
- * depending on which side we land).
+ * As per the comment in futex_unlock_pi() this should not happen.
*
* When this happens, give up our locks and try again, giving
* the futex_lock_pi() instance time to complete, either by
@@ -2792,15 +2787,18 @@ static int futex_unlock_pi(u32 __user *u
if (pi_state->owner != current)
goto out_unlock;
+ get_pi_state(pi_state);
/*
- * Grab a reference on the pi_state and drop hb->lock.
+ * Since modifying the wait_list is done while holding both
+ * hb->lock and wait_lock, holding either is sufficient to
+ * observe it.
*
- * The reference ensures pi_state lives, dropping the hb->lock
- * is tricky.. wake_futex_pi() will take rt_mutex::wait_lock to
- * close the races against futex_lock_pi(), but in case of
- * _any_ fail we'll abort and retry the whole deal.
+ * By taking wait_lock while still holding hb->lock, we ensure
+ * there is no point where we hold neither; and therefore
+ * wake_futex_pi() must observe a state consistent with what we
+ * observed.
*/
- get_pi_state(pi_state);
+ raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
spin_unlock(&hb->lock);
ret = wake_futex_pi(uaddr, uval, pi_state);

@@ -0,0 +1,204 @@
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 22 Mar 2017 11:36:00 +0100
Subject: [PATCH] futex: Drop hb->lock before enqueueing on the rtmutex
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Upstream commit 56222b212e8edb1cf51f5dd73ff645809b082b40
When PREEMPT_RT_FULL does the spinlock -> rt_mutex substitution the PI
chain code will (falsely) report a deadlock and BUG.
The problem is that it holds hb->lock (now an rt_mutex) while doing
task_blocks_on_rt_mutex() on the futex's pi_state::rtmutex. This, when
interleaved just right with futex_unlock_pi(), leads it to believe it sees an
AB-BA deadlock.
Task1 (holds rt_mutex,          Task2 (does FUTEX_LOCK_PI)
       does FUTEX_UNLOCK_PI)

                                lock hb->lock
                                lock rt_mutex (as per start_proxy)
lock hb->lock
Which is a trivial AB-BA.
It is not an actual deadlock, because it won't be holding hb->lock by the
time it actually blocks on the rt_mutex, but the chainwalk code doesn't
know that and it would be a nightmare to handle this gracefully.
To avoid this problem, do the same as in futex_unlock_pi() and drop
hb->lock after acquiring wait_lock. This still fully serializes against
futex_unlock_pi(), since adding to the wait_list does the very same lock
dance, and removing it holds both locks.
Aside from solving the RT problem, this makes the lock and unlock mechanism
symmetric and reduces the hb->lock held time.
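The resulting lock dance in futex_lock_pi(), condensed from the hunk below;
hb->lock is dropped before __rt_mutex_start_proxy_lock() enqueues the waiter,
so hb->lock never enters the rt_mutex blocking chain:

	raw_spin_lock_irq(&q.pi_state->pi_mutex.wait_lock);
	spin_unlock(q.lock_ptr);        /* drop hb->lock; wait_lock is held */
	ret = __rt_mutex_start_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter, current);
	raw_spin_unlock_irq(&q.pi_state->pi_mutex.wait_lock);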
Reported-and-tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: juri.lelli@arm.com
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170322104152.161341537@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/futex.c | 30 +++++++++++++++++-------
kernel/locking/rtmutex.c | 49 ++++++++++++++++++++++------------------
kernel/locking/rtmutex_common.h | 3 ++
3 files changed, 52 insertions(+), 30 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2652,20 +2652,33 @@ static int futex_lock_pi(u32 __user *uad
goto no_block;
}
+ rt_mutex_init_waiter(&rt_waiter);
+
/*
- * We must add ourselves to the rt_mutex waitlist while holding hb->lock
- * such that the hb and rt_mutex wait lists match.
+ * On PREEMPT_RT_FULL, when hb->lock becomes an rt_mutex, we must not
+ * hold it while doing rt_mutex_start_proxy(), because then it will
+ * include hb->lock in the blocking chain, even though we'll not in
+ * fact hold it while blocking. This will lead it to report -EDEADLK
+ * and BUG when futex_unlock_pi() interleaves with this.
+ *
+ * Therefore acquire wait_lock while holding hb->lock, but drop the
+ * latter before calling rt_mutex_start_proxy_lock(). This still fully
+ * serializes against futex_unlock_pi() as that does the exact same
+ * lock handoff sequence.
*/
- rt_mutex_init_waiter(&rt_waiter);
- ret = rt_mutex_start_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter, current);
+ raw_spin_lock_irq(&q.pi_state->pi_mutex.wait_lock);
+ spin_unlock(q.lock_ptr);
+ ret = __rt_mutex_start_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter, current);
+ raw_spin_unlock_irq(&q.pi_state->pi_mutex.wait_lock);
+
if (ret) {
if (ret == 1)
ret = 0;
+ spin_lock(q.lock_ptr);
goto no_block;
}
- spin_unlock(q.lock_ptr);
if (unlikely(to))
hrtimer_start_expires(&to->timer, HRTIMER_MODE_ABS);
@@ -2678,6 +2691,9 @@ static int futex_lock_pi(u32 __user *uad
* first acquire the hb->lock before removing the lock from the
* rt_mutex waitqueue, such that we can keep the hb and rt_mutex
* wait lists consistent.
+ *
+ * In particular, it is important that futex_unlock_pi() cannot
+ * observe this inconsistency.
*/
if (ret && !rt_mutex_cleanup_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter))
ret = 0;
@@ -2789,10 +2805,6 @@ static int futex_unlock_pi(u32 __user *u
get_pi_state(pi_state);
/*
- * Since modifying the wait_list is done while holding both
- * hb->lock and wait_lock, holding either is sufficient to
- * observe it.
- *
* By taking wait_lock while still holding hb->lock, we ensure
* there is no point where we hold neither; and therefore
* wake_futex_pi() must observe a state consistent with what we
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1659,31 +1659,14 @@ void rt_mutex_proxy_unlock(struct rt_mut
rt_mutex_set_owner(lock, NULL);
}
-/**
- * rt_mutex_start_proxy_lock() - Start lock acquisition for another task
- * @lock: the rt_mutex to take
- * @waiter: the pre-initialized rt_mutex_waiter
- * @task: the task to prepare
- *
- * Returns:
- * 0 - task blocked on lock
- * 1 - acquired the lock for task, caller should wake it up
- * <0 - error
- *
- * Special API call for FUTEX_REQUEUE_PI support.
- */
-int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
+int __rt_mutex_start_proxy_lock(struct rt_mutex *lock,
struct rt_mutex_waiter *waiter,
struct task_struct *task)
{
int ret;
- raw_spin_lock_irq(&lock->wait_lock);
-
- if (try_to_take_rt_mutex(lock, task, NULL)) {
- raw_spin_unlock_irq(&lock->wait_lock);
+ if (try_to_take_rt_mutex(lock, task, NULL))
return 1;
- }
/* We enforce deadlock detection for futexes */
ret = task_blocks_on_rt_mutex(lock, waiter, task,
@@ -1702,12 +1685,36 @@ int rt_mutex_start_proxy_lock(struct rt_
if (unlikely(ret))
remove_waiter(lock, waiter);
- raw_spin_unlock_irq(&lock->wait_lock);
-
debug_rt_mutex_print_deadlock(waiter);
return ret;
}
+
+/**
+ * rt_mutex_start_proxy_lock() - Start lock acquisition for another task
+ * @lock: the rt_mutex to take
+ * @waiter: the pre-initialized rt_mutex_waiter
+ * @task: the task to prepare
+ *
+ * Returns:
+ * 0 - task blocked on lock
+ * 1 - acquired the lock for task, caller should wake it up
+ * <0 - error
+ *
+ * Special API call for FUTEX_REQUEUE_PI support.
+ */
+int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
+ struct rt_mutex_waiter *waiter,
+ struct task_struct *task)
+{
+ int ret;
+
+ raw_spin_lock_irq(&lock->wait_lock);
+ ret = __rt_mutex_start_proxy_lock(lock, waiter, task);
+ raw_spin_unlock_irq(&lock->wait_lock);
+
+ return ret;
+}
/**
* rt_mutex_next_owner - return the next owner of the lock
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -104,6 +104,9 @@ extern void rt_mutex_init_proxy_locked(s
extern void rt_mutex_proxy_unlock(struct rt_mutex *lock,
struct task_struct *proxy_owner);
extern void rt_mutex_init_waiter(struct rt_mutex_waiter *waiter);
+extern int __rt_mutex_start_proxy_lock(struct rt_mutex *lock,
+ struct rt_mutex_waiter *waiter,
+ struct task_struct *task);
extern int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
struct rt_mutex_waiter *waiter,
struct task_struct *task);

@@ -1,7 +1,7 @@
From: "Yadi.hu" <yadi.hu@windriver.com>
Date: Wed, 10 Dec 2014 10:32:09 +0800
Subject: ARM: enable irq in translation/section permission fault handlers
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Probably happens on all ARM, with
CONFIG_PREEMPT_RT_FULL

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Thu, 21 Mar 2013 19:01:05 +0100
Subject: printk: Drop the logbuf_lock more often
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
The lock is held with irqs off. The latency drops 500us+ on my ARM boxes
with a "full" buffer after executing "dmesg" on the shell.

@@ -1,7 +1,7 @@
From: Josh Cartwright <joshc@ni.com>
Date: Thu, 11 Feb 2016 11:54:01 -0600
Subject: KVM: arm/arm64: downgrade preempt_disable()d region to migrate_disable()
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
kvm_arch_vcpu_ioctl_run() disables the use of preemption when updating
the vgic and timer states to prevent the calling task from migrating to

@@ -1,7 +1,7 @@
From: Marcelo Tosatti <mtosatti@redhat.com>
Date: Wed, 8 Apr 2015 20:33:25 -0300
Subject: KVM: lapic: mark LAPIC timer handler as irqsafe
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Since lapic timer handler only wakes up a simple waitqueue,
it can be executed from hardirq context.

@@ -5,7 +5,7 @@ Cc: Anna Schumaker <anna.schumaker@netapp.com>,
linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org,
tglx@linutronix.de
Subject: NFSv4: replace seqcount_t with a seqlock_t
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
The raw_write_seqcount_begin() in nfs4_reclaim_open_state() bugs me
because it maps to preempt_disable() in -RT which I can't have at this

@@ -1,7 +1,7 @@
From: Steven Rostedt <rostedt@goodmis.org>
Date: Wed, 13 Feb 2013 09:26:05 -0500
Subject: acpi/rt: Convert acpi_gbl_hardware lock back to a raw_spinlock_t
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
We hit the following bug with 3.6-rt:

@@ -1,7 +1,7 @@
From: Anders Roxell <anders.roxell@linaro.org>
Date: Thu, 14 May 2015 17:52:17 +0200
Subject: arch/arm64: Add lazy preempt support
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
arm64 is missing support for PREEMPT_RT. The main feature which is
lacking is support for lazy preemption. The arch-specific entry code,

@@ -1,7 +1,7 @@
From: Benedikt Spranger <b.spranger@linutronix.de>
Date: Sat, 6 Mar 2010 17:47:10 +0100
Subject: ARM: AT91: PIT: Remove irq handler when clock event is unused
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Set up and remove the interrupt handler in clock event mode selection.
This avoids calling the (shared) interrupt handler when the device is

@@ -1,7 +1,7 @@
From: Thomas Gleixner <tglx@linutronix.de>
Date: Sat, 1 May 2010 18:29:35 +0200
Subject: ARM: at91: tclib: Default to tclib timer for RT
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
RT is not too happy about the shared timer interrupt in AT91
devices. Default to tclib timer for RT.

@@ -1,7 +1,7 @@
From: Frank Rowand <frank.rowand@am.sony.com>
Date: Mon, 19 Sep 2011 14:51:14 -0700
Subject: arm: Convert arm boot_lock to raw
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
The arm boot_lock is used by the secondary processor startup code. The locking
task is the idle thread, which has idle->sched_class == &idle_sched_class.

@@ -1,7 +1,7 @@
Subject: arm: Enable highmem for rt
From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 13 Feb 2013 11:03:11 +0100
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
fixup highmem for ARM.

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Mon, 11 Mar 2013 21:37:27 +0100
Subject: arm/highmem: Flush tlb on unmap
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
The TLB should be flushed on unmap and thus make the mapping entry
invalid. This is only done in the non-debug case which does not look

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Thu, 22 Dec 2016 17:28:33 +0100
Subject: [PATCH] arm: include definition for cpumask_t
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
This definition gets pulled in by other files. With the (later) split of
RCU and spinlock.h it won't compile anymore.

@@ -1,7 +1,7 @@
From: Yang Shi <yang.shi@linaro.org>
Date: Thu, 10 Nov 2016 16:17:55 -0800
Subject: [PATCH] arm: kprobe: replace patch_lock to raw lock
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
When running kprobe on an -rt kernel, the following bug is caught:

@@ -1,7 +1,7 @@
Subject: arm: Add support for lazy preemption
From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 31 Oct 2012 12:04:11 +0100
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Implement the arm pieces for lazy preempt.

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Fri, 20 Sep 2013 14:31:54 +0200
Subject: arm/unwind: use a raw_spin_lock
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Mostly unwind is done with irqs enabled; however, SLUB may call it with
irqs disabled while creating a new SLUB cache.

@@ -1,7 +1,7 @@
Subject: arm64/xen: Make XEN depend on !RT
From: Thomas Gleixner <tglx@linutronix.de>
Date: Mon, 12 Oct 2015 11:18:40 +0200
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
It's not ready and probably never will be, unless xen folks have a
look at it.

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Wed, 09 Mar 2016 10:51:06 +0100
Subject: arm: at91: do not disable/enable clocks in a row
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Currently the driver will disable the clock and enable it one line later
if it is switching from periodic mode into one shot.

@@ -1,7 +1,7 @@
From: Steven Rostedt <srostedt@redhat.com>
Date: Fri, 3 Jul 2009 08:44:29 -0500
Subject: ata: Do not disable interrupts in ide code for preempt-rt
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Use the local_irq_*_nort variants.

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Fri, 13 Feb 2015 11:01:26 +0100
Subject: block: blk-mq: Use swait
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
| BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:914
| in_atomic(): 1, irqs_disabled(): 0, pid: 255, name: kworker/u257:6

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Thu, 29 Jan 2015 15:10:08 +0100
Subject: block/mq: don't complete requests via IPI
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
The IPI runs in hardirq context and there are sleeping locks. This patch
moves the completion into a workqueue.

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Tue, 14 Jul 2015 14:26:34 +0200
Subject: block/mq: do not invoke preempt_disable()
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
preempt_disable() and get_cpu() don't play well together with the sleeping
locks it tries to allocate later.

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Wed, 9 Apr 2014 10:37:23 +0200
Subject: block: mq: use cpu_light()
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
There is a might-sleep splat because get_cpu() disables preemption and
later we grab a lock. As a workaround for this we use get_cpu_light().

@@ -1,7 +1,7 @@
Subject: block: Shorten interrupt disabled regions
From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 22 Jun 2011 19:47:02 +0200
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Moving the blk_sched_flush_plug() call out of the interrupt/preempt
disabled region in the scheduler allows us to replace

@@ -1,7 +1,7 @@
Subject: block: Use cpu_chill() for retry loops
From: Thomas Gleixner <tglx@linutronix.de>
Date: Thu, 20 Dec 2012 18:28:26 +0100
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Retry loops on RT might loop forever when the modifying side was
preempted. Steven also observed a live lock when there was a

@@ -1,7 +1,7 @@
From: Ingo Molnar <mingo@elte.hu>
Date: Fri, 3 Jul 2009 08:29:58 -0500
Subject: bug: BUG_ON/WARN_ON variants dependend on RT/!RT
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Introduce RT/NON-RT WARN/BUG statements to avoid ifdefs in the code.

@@ -1,7 +1,7 @@
From: Mike Galbraith <umgwanakikbuti@gmail.com>
Date: Sat, 21 Jun 2014 10:09:48 +0200
Subject: memcontrol: Prevent scheduling while atomic in cgroup code
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
mm, memcg: make refill_stock() use get_cpu_light()

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Fri, 13 Feb 2015 15:52:24 +0100
Subject: cgroups: use simple wait in css_release()
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
To avoid:
|BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:914

@@ -1,7 +1,7 @@
From: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Date: Thu, 17 Mar 2016 21:09:43 +0100
Subject: [PATCH] clockevents/drivers/timer-atmel-pit: fix double free_irq
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
clockevents_exchange_device() changes the state from detached to shutdown
and so at that point the IRQ has not yet been requested.

@@ -1,7 +1,7 @@
From: Benedikt Spranger <b.spranger@linutronix.de>
Date: Mon, 8 Mar 2010 18:57:04 +0100
Subject: clocksource: TCLIB: Allow higher clock rates for clock events
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
By default the TCLIB uses the 32KiHz base clock rate for clock events.
Add a compile-time selection to allow higher clock resolution.

@@ -1,7 +1,7 @@
Subject: completion: Use simple wait queues
From: Thomas Gleixner <tglx@linutronix.de>
Date: Fri, 11 Jan 2013 11:23:51 +0100
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Completions have no long-lasting callbacks and therefore do not need
the complex waitqueue variant. Use simple waitqueues which reduces the

@@ -1,7 +1,7 @@
Subject: sched: Use the proper LOCK_OFFSET for cond_resched()
From: Thomas Gleixner <tglx@linutronix.de>
Date: Sun, 17 Jul 2011 22:51:33 +0200
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
RT does not increment preempt count when a 'sleeping' spinlock is
locked. Update PREEMPT_LOCK_OFFSET for that case.

@@ -1,7 +1,7 @@
Subject: sched: Take RT softirq semantics into account in cond_resched()
From: Thomas Gleixner <tglx@linutronix.de>
Date: Thu, 14 Jul 2011 09:56:44 +0200
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
The softirq semantics work differently on -RT. There is no SOFTIRQ_MASK in
the preemption counter which leads to the BUG_ON() statement in

@@ -2,7 +2,7 @@ From: Mike Galbraith <umgwanakikbuti@gmail.com>
Date: Sun, 16 Oct 2016 05:11:54 +0200
Subject: [PATCH] connector/cn_proc: Protect send_msg() with a local lock
on RT
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
|BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:931
|in_atomic(): 1, irqs_disabled(): 0, pid: 31807, name: sleep

@@ -1,7 +1,7 @@
From: Steven Rostedt <rostedt@goodmis.org>
Date: Thu, 5 Dec 2013 09:16:52 -0500
Subject: cpu hotplug: Document why PREEMPT_RT uses a spinlock
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
The patch:

@@ -1,7 +1,7 @@
Subject: cpu: Make hotplug.lock a "sleeping" spinlock on RT
From: Steven Rostedt <rostedt@goodmis.org>
Date: Fri, 02 Mar 2012 10:36:57 -0500
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Tasks can block on hotplug.lock in pin_current_cpu(), but their state
might be != RUNNING. So the mutex wakeup will set the state

@@ -1,7 +1,7 @@
From: Steven Rostedt <srostedt@redhat.com>
Date: Mon, 16 Jul 2012 08:07:43 +0000
Subject: cpu/rt: Rework cpu down for PREEMPT_RT
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Bringing a CPU down is a pain with the PREEMPT_RT kernel because
tasks can be preempted in many more places than in non-RT. In

@@ -1,7 +1,7 @@
From: Steven Rostedt <rostedt@goodmis.org>
Date: Tue, 4 Mar 2014 12:28:32 -0500
Subject: cpu_chill: Add a UNINTERRUPTIBLE hrtimer_nanosleep
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
We hit another bug that was caused by switching cpu_chill() from
msleep() to hrtimer_nanosleep().

@@ -1,7 +1,7 @@
From: Tiejun Chen <tiejun.chen@windriver.com>
Subject: cpu_down: move migrate_enable() back
Date: Thu, 7 Nov 2013 10:06:07 +0800
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Commit 08c1ab68, "hotplug-use-migrate-disable.patch", intends to
use migrate_enable()/migrate_disable() to replace that combination

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Thu, 9 Apr 2015 15:23:01 +0200
Subject: cpufreq: drop K8's driver from being selected
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Ralf posted a picture of a backtrace from

@@ -1,7 +1,7 @@
Subject: cpumask: Disable CONFIG_CPUMASK_OFFSTACK for RT
From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 14 Dec 2011 01:03:49 +0100
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
There are "valid" GFP_ATOMIC allocations such as

@@ -1,7 +1,7 @@
From: Mike Galbraith <efault@gmx.de>
Date: Sun, 8 Jan 2017 09:32:25 +0100
Subject: [PATCH] cpuset: Convert callback_lock to raw_spinlock_t
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
The two commits below add up to a cpuset might_sleep() splat for RT:

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Fri, 21 Feb 2014 17:24:04 +0100
Subject: crypto: Reduce preempt disabled regions, more algos
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Don Estabrook reported
| kernel: WARNING: CPU: 2 PID: 858 at kernel/sched/core.c:2428 migrate_disable+0xed/0x100()

@@ -1,7 +1,7 @@
Subject: debugobjects: Make RT aware
From: Thomas Gleixner <tglx@linutronix.de>
Date: Sun, 17 Jul 2011 21:41:35 +0200
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Avoid filling the pool / allocating memory with irqs off().

@@ -1,7 +1,7 @@
Subject: dm: Make rt aware
From: Thomas Gleixner <tglx@linutronix.de>
Date: Mon, 14 Nov 2011 23:06:09 +0100
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Use the BUG_ON_NORT variant for the irq_disabled() checks. RT has
interrupts legitimately enabled here as we can't deadlock against the

@@ -2,7 +2,7 @@ From: Mike Galbraith <umgwanakikbuti@gmail.com>
Date: Thu, 31 Mar 2016 04:08:28 +0200
Subject: [PATCH] drivers/block/zram: Replace bit spinlocks with rtmutex
for -rt
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
They're nondeterministic, and lead to ___might_sleep() splats in -rt.
OTOH, they're a lot less wasteful than an rtmutex per page.

@@ -1,7 +1,7 @@
From: Ingo Molnar <mingo@elte.hu>
Date: Fri, 3 Jul 2009 08:29:24 -0500
Subject: drivers/net: Use disable_irq_nosync() in 8139too
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Use disable_irq_nosync() instead of disable_irq() as this might be
called in atomic context with netpoll.

@@ -1,7 +1,7 @@
From: Steven Rostedt <rostedt@goodmis.org>
Date: Fri, 3 Jul 2009 08:30:00 -0500
Subject: drivers/net: vortex fix locking issues
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Argh, cut and paste wasn't enough...

@@ -1,7 +1,7 @@
From: Ingo Molnar <mingo@elte.hu>
Date: Fri, 3 Jul 2009 08:29:30 -0500
Subject: drivers: random: Reduce preempt disabled region
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
No need to keep preemption disabled across the whole function.

@@ -1,7 +1,7 @@
Subject: tty/serial/omap: Make the locking RT aware
From: Thomas Gleixner <tglx@linutronix.de>
Date: Thu, 28 Jul 2011 13:32:57 +0200
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
The lock is a sleeping lock and local_irq_save() is not the
optimisation we are looking for. Redo it to make it work on -RT and

@@ -1,7 +1,7 @@
Subject: tty/serial/pl011: Make the locking work on RT
From: Thomas Gleixner <tglx@linutronix.de>
Date: Tue, 08 Jan 2013 21:36:51 +0100
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
The lock is a sleeping lock and local_irq_save() is not the optimisation
we are looking for. Redo it to make it work on -RT and non-RT.

@@ -2,7 +2,7 @@ From: Mike Galbraith <umgwanakikbuti@gmail.com>
Date: Thu, 20 Oct 2016 11:15:22 +0200
Subject: [PATCH] drivers/zram: Don't disable preemption in
zcomp_stream_get/put()
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
In v4.7, the driver switched to percpu compression streams, disabling
preemption via get/put_cpu_ptr(). Use a per-zcomp_strm lock here. We

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Thu, 25 Apr 2013 18:12:52 +0200
Subject: drm/i915: drop trace_i915_gem_ring_dispatch on rt
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
This tracepoint is responsible for:

@@ -1,7 +1,7 @@
Subject: drm,i915: Use local_lock/unlock_irq() in intel_pipe_update_start/end()
From: Mike Galbraith <umgwanakikbuti@gmail.com>
Date: Sat, 27 Feb 2016 09:01:42 +0100
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
[ 8.014039] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:918

@@ -1,7 +1,7 @@
Subject: drm,radeon,i915: Use preempt_disable/enable_rt() where recommended
From: Mike Galbraith <umgwanakikbuti@gmail.com>
Date: Sat, 27 Feb 2016 08:09:11 +0100
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
DRM folks identified the spots, so use them.

@@ -1,7 +1,7 @@
Subject: fs/epoll: Do not disable preemption on RT
From: Thomas Gleixner <tglx@linutronix.de>
Date: Fri, 08 Jul 2011 16:35:35 +0200
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
ep_call_nested() takes a sleeping lock so we can't disable preemption.
The light version is enough since ep_call_nested() doesn't mind being

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Mon, 16 Feb 2015 18:49:10 +0100
Subject: fs/aio: simple simple work
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
|BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:768
|in_atomic(): 1, irqs_disabled(): 0, pid: 26, name: rcuos/2

@@ -1,7 +1,7 @@
Subject: block: Turn off warning which is bogus on RT
From: Thomas Gleixner <tglx@linutronix.de>
Date: Tue, 14 Jun 2011 17:05:09 +0200
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
On -RT the context is always with IRQs enabled. Ignore this warning on -RT.

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Wed, 14 Sep 2016 11:55:23 +0200
Subject: fs/dcache: include wait.h
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Since commit d9171b934526 ("parallel lookups machinery, part 4 (and
last)") dcache.h is using but does not include wait.h. It works as long

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Wed, 14 Sep 2016 17:57:03 +0200
Subject: [PATCH] fs/dcache: init in_lookup_hashtable
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
in_lookup_hashtable was introduced in commit 94bdd655caba ("parallel
lookups machinery, part 3") and never initialized but since it is in

@@ -1,7 +1,7 @@
Subject: fs: dcache: Use cpu_chill() in trylock loops
From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 07 Mar 2012 21:00:34 +0100
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Retry loops on RT might loop forever when the modifying side was
preempted. Use cpu_chill() instead of cpu_relax() to let the system

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Wed, 14 Sep 2016 14:35:49 +0200
Subject: [PATCH] fs/dcache: use swait_queue instead of waitqueue
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
__d_lookup_done() invokes wake_up_all() while holding a hlist_bl_lock()
which disables preemption. As a workaround convert it to swait.

@@ -1,7 +1,7 @@
From: Thomas Gleixner <tglx@linutronix.de>
Date: Fri, 18 Mar 2011 10:11:25 +0100
Subject: fs: jbd/jbd2: Make state lock and journal head lock rt safe
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
bit_spin_locks break under RT.

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Mon, 17 Feb 2014 17:30:03 +0100
Subject: fs: jbd2: pull your plug when waiting for space
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Two cps in parallel managed to stall the ext4 fs. It seems that
journal code is either waiting for locks or sleeping waiting for

@@ -1,7 +1,7 @@
From: Thomas Gleixner <tglx@linutronix.de>
Date: Sun, 19 Jul 2009 08:44:27 -0500
Subject: fs: namespace preemption fix
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
On RT we cannot loop with preemption disabled here as
mnt_make_readonly() might have been preempted. We can safely enable

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Thu, 15 Sep 2016 10:51:27 +0200
Subject: [PATCH] fs/nfs: turn rmdir_sem into a semaphore
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
The RW semaphore had a reader side which used the _non_owner version
because it most likely took the reader lock in one thread and released it

@@ -1,7 +1,7 @@
From: Mike Galbraith <efault@gmx.de>
Date: Fri, 3 Jul 2009 08:44:12 -0500
Subject: fs: ntfs: disable interrupt only on !RT
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
On Sat, 2007-10-27 at 11:44 +0200, Ingo Molnar wrote:
> * Nick Piggin <nickpiggin@yahoo.com.au> wrote:

@@ -1,7 +1,7 @@
From: Thomas Gleixner <tglx@linutronix.de>
Date: Fri, 18 Mar 2011 09:18:52 +0100
Subject: buffer_head: Replace bh_uptodate_lock for -rt
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Wrap the bit_spin_lock calls into a separate inline and add the RT
replacements with a real spinlock.

@@ -1,7 +1,7 @@
From: Mike Galbraith <umgwanakikbuti@gmail.com>
Date: Sun, 16 Oct 2016 05:08:30 +0200
Subject: [PATCH] ftrace: Fix trace header alignment
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Line up helper arrows to the right column.

@@ -1,7 +1,7 @@
From: Thomas Gleixner <tglx@linutronix.de>
Date: Sun, 17 Jul 2011 21:56:42 +0200
Subject: trace: Add migrate-disabled counter to tracing output
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---

@@ -1,7 +1,7 @@
From: Thomas Gleixner <tglx@linutronix.de>
Date: Fri, 1 Mar 2013 11:17:42 +0100
Subject: futex: Ensure lock/unlock symetry versus pi_lock and hash bucket lock
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
In exit_pi_state_list() we have the following locking construct:
@@ -31,7 +31,7 @@ Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -904,7 +904,9 @@ void exit_pi_state_list(struct task_stru
@@ -909,7 +909,9 @@ void exit_pi_state_list(struct task_stru
* task still owns the PI-state:
*/
if (head->next != next) {

@@ -1,7 +1,7 @@
From: Steven Rostedt <rostedt@goodmis.org>
Date: Tue, 14 Jul 2015 14:26:34 +0200
Subject: futex: Fix bug on when a requeued RT task times out
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Requeue with timeout causes a bug with PREEMPT_RT_FULL.
@@ -66,9 +66,9 @@ Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
}
/*
@@ -1704,6 +1705,35 @@ int rt_mutex_start_proxy_lock(struct rt_
@@ -1696,6 +1697,35 @@ int __rt_mutex_start_proxy_lock(struct r
if (try_to_take_rt_mutex(lock, task, NULL))
return 1;
}
+#ifdef CONFIG_PREEMPT_RT_FULL
+ /*

@@ -0,0 +1,59 @@
From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 8 Mar 2017 14:23:35 +0100
Subject: [PATCH] futex: workaround migrate_disable/enable in different context
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
migrate_disable()/migrate_enable() takes a different path in atomic() vs
!atomic() context. These little hacks ensure that we don't underflow /
overflow the migrate code counts while we lock the hb lock with interrupts
enabled and unlock it with interrupts disabled.
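Condensed from the first hunk below; the extra migrate_disable()/
migrate_enable() pair brackets the spin_unlock() that now runs in atomic
context (wait_lock held, interrupts off), keeping the counts balanced:

	raw_spin_lock_irq(&q.pi_state->pi_mutex.wait_lock);
	migrate_disable();              /* taken in the in_atomic() fast path */
	spin_unlock(q.lock_ptr);        /* its migrate_enable() runs in_atomic() */
	ret = __rt_mutex_start_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter, current);
	raw_spin_unlock_irq(&q.pi_state->pi_mutex.wait_lock);
	migrate_enable();               /* reverses the pending slow-path disable */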
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/futex.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2667,9 +2667,18 @@ static int futex_lock_pi(u32 __user *uad
* lock handoff sequence.
*/
raw_spin_lock_irq(&q.pi_state->pi_mutex.wait_lock);
+ /*
+ * the migrate_disable() here disables migration in the in_atomic() fast
+ * path which is enabled again in the following spin_unlock(). We have
+ * one migrate_disable() pending in the slow-path which is reversed
+ * after the raw_spin_unlock_irq() where we leave the atomic context.
+ */
+ migrate_disable();
+
spin_unlock(q.lock_ptr);
ret = __rt_mutex_start_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter, current);
raw_spin_unlock_irq(&q.pi_state->pi_mutex.wait_lock);
+ migrate_enable();
if (ret) {
if (ret == 1)
@@ -2811,10 +2820,21 @@ static int futex_unlock_pi(u32 __user *u
* observed.
*/
raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
+ /*
+ * Magic trickery for now to make the RT migrate disable
+ * logic happy. The following spin_unlock() happens with
+ * interrupts disabled so the internal migrate_enable()
+ * won't undo the migrate_disable() which was issued when
+ * locking hb->lock.
+ */
+ migrate_disable();
spin_unlock(&hb->lock);
+ /* Drops pi_state->pi_mutex.wait_lock */
ret = wake_futex_pi(uaddr, uval, pi_state);
+ migrate_enable();
+
put_pi_state(pi_state);
/*

@@ -1,7 +1,7 @@
From: Ingo Molnar <mingo@elte.hu>
Date: Fri, 3 Jul 2009 08:29:57 -0500
Subject: genirq: Disable irqpoll on -rt
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Creates long latencies for no value

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Wed, 21 Aug 2013 17:48:46 +0200
Subject: genirq: Do not invoke the affinity callback via a workqueue on RT
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Joe Korty reported that __irq_set_affinity_locked() schedules a
workqueue while holding a rawlock which results in a might_sleep()

@@ -1,7 +1,7 @@
Subject: genirq: Force interrupt thread on RT
From: Thomas Gleixner <tglx@linutronix.de>
Date: Sun, 03 Apr 2011 11:57:29 +0200
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Force threaded_irqs and optimize the code (force_irqthreads) in regard
to this.

@@ -1,7 +1,7 @@
From: Josh Cartwright <joshc@ni.com>
Date: Thu, 11 Feb 2016 11:54:00 -0600
Subject: genirq: update irq_set_irqchip_state documentation
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
On -rt kernels, the use of migrate_disable()/migrate_enable() is
sufficient to guarantee a task isn't moved to another CPU. Update the

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Tue, 14 Jul 2015 14:26:34 +0200
Subject: gpu: don't check for the lock owner.
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---

@@ -1,7 +1,7 @@
From: Mike Galbraith <umgwanakikbuti@gmail.com>
Date: Tue, 24 Mar 2015 08:14:49 +0100
Subject: hotplug: Use set_cpus_allowed_ptr() in sync_unplug_thread()
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
do_set_cpus_allowed() is not safe vs ->sched_class change.

@@ -1,7 +1,7 @@
Subject: hotplug: Lightweight get online cpus
From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 15 Jun 2011 12:36:06 +0200
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
get_online_cpus() is a heavyweight function which involves a global
mutex. migrate_disable() wants a simpler construct which prevents only

@@ -1,7 +1,7 @@
Subject: hotplug: sync_unplug: No "\n" in task name
From: Yong Zhang <yong.zhang0@gmail.com>
Date: Sun, 16 Oct 2011 18:56:43 +0800
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Otherwise the output will look a little odd.

@@ -1,7 +1,7 @@
Subject: hotplug: Use migrate disable on unplug
From: Thomas Gleixner <tglx@linutronix.de>
Date: Sun, 17 Jul 2011 19:35:29 +0200
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Migration needs to be disabled across the unplug handling to make
sure that the unplug thread is off the unplugged cpu.

@@ -1,7 +1,7 @@
From: Yang Shi <yang.shi@windriver.com>
Date: Mon, 16 Sep 2013 14:09:19 -0700
Subject: hrtimer: Move schedule_work call to helper thread
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
When running the LTP leapsec_timer test, the following call trace is caught:

@@ -1,7 +1,7 @@
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Wed, 23 Dec 2015 20:57:41 +0100
Subject: hrtimer: enfore 64byte alignment
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
The patch "hrtimer: Fixup hrtimer callback changes for preempt-rt" adds
a list_head expired to struct hrtimer_clock_base and with it we run into

@@ -1,7 +1,7 @@
From: Thomas Gleixner <tglx@linutronix.de>
Date: Fri, 3 Jul 2009 08:44:31 -0500
Subject: hrtimer: Fixup hrtimer callback changes for preempt-rt
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
In preempt-rt we can not call the callbacks which take sleeping locks
from the timer interrupt context.

@@ -1,7 +1,7 @@
From: Ingo Molnar <mingo@elte.hu>
Date: Fri, 3 Jul 2009 08:29:34 -0500
Subject: hrtimers: Prepare full preemption
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt13.tar.xz
Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/older/patches-4.9.18-rt14.tar.xz
Make cancellation of a running callback in softirq context safe
against preemption.

Some files were not shown because too many files have changed in this diff.