Subject: net: Avoid livelock in net_tx_action() on RT
From: Steven Rostedt <srostedt@redhat.com>
Date: Thu, 06 Oct 2011 10:48:39 -0400

|
qdisc_lock is taken w/o disabling interrupts or bottom halfs. So code
|
|
holding a qdisc_lock() can be interrupted and softirqs can run on the
|
|
return of interrupt in !RT.
|
|
|
|

The spin_trylock() in net_tx_action() ensures that the softirq does not
deadlock: when the lock cannot be acquired, q is requeued and the
NET_TX softirq is raised again, which causes the softirq to run over
and over.
|
That works in mainline as do_softirq() has a retry loop limit and
|
|
leaves the softirq processing in the interrupt return path and
|
|
schedules ksoftirqd. The task which holds qdisc_lock cannot be
|
|
preempted, so the lock is released and either ksoftirqd or the next
|
|
softirq in the return from interrupt path can proceed. Though it's a
|
|
bit strange to actually run MAX_SOFTIRQ_RESTART (10) loops before it
|
|
decides to bail out even if it's clear in the first iteration :)
|
|
|
|
On RT all softirq processing is done in a FIFO thread and we don't
|
|
have a loop limit, so ksoftirqd preempts the lock holder forever and
|
|
unqueues and requeues until the reset button is hit.
|
|
|
|

Due to the forced threading of ksoftirqd on RT, we actually cannot
deadlock on qdisc_lock because it's a "sleeping lock". So it's safe to
replace the spin_trylock() with a spin_lock(). When contended,
ksoftirqd is scheduled out and the lock holder can proceed.
|
[ tglx: Massaged changelog and code comments ]
|
|
|
|
Solved-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Carsten Emde <cbe@osadl.org>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Claudio R. Goncalves <lclaudio@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 net/core/dev.c |   32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

Index: linux-3.2/net/core/dev.c
===================================================================
--- linux-3.2.orig/net/core/dev.c
+++ linux-3.2/net/core/dev.c
@@ -3043,6 +3043,36 @@ int netif_rx_ni(struct sk_buff *skb)
 }
 EXPORT_SYMBOL(netif_rx_ni);
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+/*
+ * RT runs ksoftirqd as a real time thread and the root_lock is a
+ * "sleeping spinlock". If the trylock fails then we can go into an
+ * infinite loop when ksoftirqd preempted the task which actually
+ * holds the lock, because we requeue q and raise NET_TX softirq
+ * causing ksoftirqd to loop forever.
+ *
+ * It's safe to use spin_lock on RT here as softirqs run in thread
+ * context and cannot deadlock against the thread which is holding
+ * root_lock.
+ *
+ * On !RT the trylock might fail, but there we bail out from the
+ * softirq loop after 10 attempts which we can't do on RT. And the
+ * task holding root_lock cannot be preempted, so the only downside of
+ * that trylock is that we need 10 loops to decide that we should have
+ * given up in the first one :)
+ */
+static inline int take_root_lock(spinlock_t *lock)
+{
+	spin_lock(lock);
+	return 1;
+}
+#else
+static inline int take_root_lock(spinlock_t *lock)
+{
+	return spin_trylock(lock);
+}
+#endif
+
 static void net_tx_action(struct softirq_action *h)
 {
 	struct softnet_data *sd = &__get_cpu_var(softnet_data);
@@ -3081,7 +3111,7 @@ static void net_tx_action(struct softirq
 			head = head->next_sched;
 
 			root_lock = qdisc_lock(q);
-			if (spin_trylock(root_lock)) {
+			if (take_root_lock(root_lock)) {
 				smp_mb__before_clear_bit();
 				clear_bit(__QDISC_STATE_SCHED,
 					  &q->state);