From: Peter Zijlstra
Date: Fri, 21 Aug 2009 11:56:45 +0200
Subject: timer: delay waking softirqs from the jiffy tick

People were complaining about broken balancing with the recent -rt
series.

A look at /proc/sched_debug yielded:

cpu#0, 2393.874 MHz
  .nr_running                    : 0
  .load                          : 0
  .cpu_load[0]                   : 177522
  .cpu_load[1]                   : 177522
  .cpu_load[2]                   : 177522
  .cpu_load[3]                   : 177522
  .cpu_load[4]                   : 177522

cpu#1, 2393.874 MHz
  .nr_running                    : 4
  .load                          : 4096
  .cpu_load[0]                   : 181618
  .cpu_load[1]                   : 180850
  .cpu_load[2]                   : 180274
  .cpu_load[3]                   : 179938
  .cpu_load[4]                   : 179758

Which indicated the cpu_load computation was hosed; the 177522 value
indicates that there is one RT task runnable. Initially I thought the
old problem of calculating the cpu_load from a softirq had re-surfaced,
however looking at the code shows it's being done from scheduler_tick().

[ we really should fix this RT/cfs interaction some day... ]

A few trace_printk()s later:

    sirq-timer/1-19    [001]   174.289744:     19: 50:S ==> [001]     0:140:R
          <idle>-0     [001]   174.290724: enqueue_task_rt: adding task: 19/sirq-timer/1 with load: 177522
          <idle>-0     [001]   174.290725:      0:140:R   + [001]    19: 50:S sirq-timer/1
          <idle>-0     [001]   174.290730: scheduler_tick: current load: 177522
          <idle>-0     [001]   174.290732: scheduler_tick: current: 0/swapper
          <idle>-0     [001]   174.290736:      0:140:R ==> [001]    19: 50:R sirq-timer/1
    sirq-timer/1-19    [001]   174.290741: dequeue_task_rt: removing task: 19/sirq-timer/1 with load: 177522
    sirq-timer/1-19    [001]   174.290743:     19: 50:S ==> [001]     0:140:R

We see that we always raise the timer softirq before doing the load
calculation. Avoid this by re-ordering the scheduler_tick() call in
update_process_times() to occur before we deal with timers.

This lowers the load back to sanity and restores regular load-balancing
behaviour.

Signed-off-by: Peter Zijlstra
Signed-off-by: Thomas Gleixner

---
 kernel/timer.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-stable/kernel/timer.c
===================================================================
--- linux-stable.orig/kernel/timer.c
+++ linux-stable/kernel/timer.c
@@ -1400,13 +1400,13 @@ void update_process_times(int user_tick)
 
 	/* Note: this timer irq context must be accounted for as well. */
 	account_process_tick(p, user_tick);
+	scheduler_tick();
 	run_local_timers();
 	rcu_check_callbacks(cpu, user_tick);
 #ifdef CONFIG_IRQ_WORK
 	if (in_irq())
 		irq_work_run();
 #endif
-	scheduler_tick();
 	run_posix_cpu_timers(p);
 }
 