August 10, 2012 (v3.5+)

This article was contributed by Paul E. McKenney

Introduction

  1. Stall-Detection Overview
  2. Stall-Detection Operation
  3. Stall-Detection Implementation
  4. Summary

And then there are of course the inevitable answers to the quick quizzes.

Stall-Detection Overview

If a given CPU (or task, for preemptible RCU) remains in its read-side critical section indefinitely, the corresponding RCU grace period will be indefinitely stalled. However, code that remains in an RCU read-side critical section indefinitely is probably buggy. Possible bugs resulting in this behavior include:
  1. An infinite loop in an RCU read-side critical section.
  2. For CONFIG_PREEMPT=n kernels, an infinite loop anywhere in the kernel that does not include a call to schedule() or some similar function.
  3. An infinite loop in a high-priority real-time thread that preempts some lower-priority task while in an RCU read-side critical section. (This is what CONFIG_RCU_BOOST is designed to handle.)
  4. A hypervisor preempting a guest OS's virtual CPU for too long.
  5. A hardware or software issue that disables the scheduling-clock interrupt on some CPU that is not in dyntick-idle mode. This is quite unlikely, but has really happened.
  6. A hardware or software issue that causes different CPUs to have way different ideas of what time it is. This is quite unlikely, but has really happened.
  7. A hardware failure. This is even more unlikely, but also has really happened at least once.
  8. And last, but unfortunately not least, a bug in the RCU implementation.

Quick Quiz 1: How could a hardware failure result in an RCU CPU stall warning?
Answer

RCU issues stall warnings when the grace period has extended for too long, and uses RCU's data structures to identify which CPUs and tasks are responsible for the stall.

Stall-Detection Operation

RCU CPU stall detection is invoked from rcu_pending(), which in turn is invoked from within the scheduling-clock interrupt and, in CONFIG_RCU_FAST_NO_HZ kernels, upon entry to idle. Each time that it is invoked, it scans the rcu_node structures, using the ->qsmask bits to identify stalled CPUs and the ->blkd_tasks lists (along with the ->gp_tasks pointer) to identify stalled tasks.

Quick Quiz 2: Why not instead invoke RCU CPU stall detection from the grace-period-detection kthread?
Answer

The stalled CPUs and tasks are then printed out, along with additional information if so configured.

With that background, we are now ready to look at the code.

Stall-Detection Implementation

The stall-detection process is controlled by the rcu_cpu_stall_suppress and rcu_cpu_stall_timeout variables, both of which may be set via boot-time kernel parameters or via sysfs. The rcu_cpu_stall_suppress variable, as its name suggests, suppresses further RCU CPU stall warnings when its value is nonzero. Its initial value is zero, but it is set during panics and other error conditions to keep the corresponding diagnostics from being interspersed with RCU CPU stall warnings.

The rcu_cpu_stall_timeout variable contains the number of seconds that an RCU grace period may be stalled before stall warnings are issued. Its default is controlled by the CONFIG_RCU_CPU_STALL_TIMEOUT kernel configuration parameter.

The first two functions, jiffies_till_stall_check() and record_gp_stall_check_time(), compute the time at which the next check for CPU stalls will take place.

  1 static int jiffies_till_stall_check(void)
  2 {
  3   int till_stall_check = ACCESS_ONCE(rcu_cpu_stall_timeout);
  4 
  5   if (till_stall_check < 3) {
  6     ACCESS_ONCE(rcu_cpu_stall_timeout) = 3;
  7     till_stall_check = 3;
  8   } else if (till_stall_check > 300) {
  9     ACCESS_ONCE(rcu_cpu_stall_timeout) = 300;
 10     till_stall_check = 300;
 11   }
 12   return till_stall_check * HZ + RCU_STALL_DELAY_DELTA;
 13 }
 14 
 15 static void record_gp_stall_check_time(struct rcu_state *rsp)
 16 {
 17   rsp->gp_start = jiffies;
 18   rsp->jiffies_stall = jiffies + jiffies_till_stall_check();
 19 }

The jiffies_till_stall_check() function is shown on lines 1-13 above. Line 3 fetches the current value of rcu_cpu_stall_timeout, which is subject to concurrent updates from sysfs, hence the ACCESS_ONCE(). Lines 5-11 enforce range limits, with a minimum of 3 seconds and a maximum of 300 seconds. Finally, line 12 converts from seconds to jiffies, and adds a delta (five seconds) if CONFIG_PROVE_RCU=y.

The record_gp_stall_check_time() function is shown on lines 15-19. It simply records the start time of the grace period (line 17) and the time of the first CPU-stall check (line 18).
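
For concreteness, here is a small user-space model of the seconds-to-jiffies conversion performed by jiffies_till_stall_check(). It assumes HZ=1000 and CONFIG_RCU_CPU_STALL_TIMEOUT defaults chosen for illustration, and it takes CONFIG_PROVE_RCU=n so that RCU_STALL_DELAY_DELTA is zero; it also omits the write-back of the clamped value. It illustrates the arithmetic only, and is not kernel code.

  #include <stdio.h>

  #define HZ 1000                  /* assumed scheduling-clock frequency */
  #define RCU_STALL_DELAY_DELTA 0  /* nonzero only for CONFIG_PROVE_RCU=y */

  static int rcu_cpu_stall_timeout;  /* would normally come from sysfs */

  /* User-space model of jiffies_till_stall_check(). */
  static int jiffies_till_stall_check(void)
  {
    int till_stall_check = rcu_cpu_stall_timeout;

    if (till_stall_check < 3)        /* clamp to the [3, 300]-second range */
      till_stall_check = 3;
    else if (till_stall_check > 300)
      till_stall_check = 300;
    return till_stall_check * HZ + RCU_STALL_DELAY_DELTA;
  }

  int main(void)
  {
    int timeouts[] = { 0, 21, 60, 1000 };
    int i;

    for (i = 0; i < 4; i++) {
      rcu_cpu_stall_timeout = timeouts[i];
      printf("rcu_cpu_stall_timeout=%4d seconds -> %6d jiffies\n",
             timeouts[i], jiffies_till_stall_check());
    }
    return 0;
  }

With these assumptions, a 60-second timeout corresponds to 60,000 jiffies, while out-of-range requests of 0 and 1000 seconds are silently mapped to 3,000 and 300,000 jiffies, respectively.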

The next set of functions handles the CONFIG_RCU_CPU_STALL_INFO=y case, printing additional state information for the current stall warning.

  1 static void print_cpu_stall_fast_no_hz(char *cp, int cpu)
  2 {
  3   struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu);
  4   struct timer_list *tltp = &rdtp->idle_gp_timer;
  5 
  6   sprintf(cp, "drain=%d %c timer=%lu",
  7     rdtp->dyntick_drain,
  8     rdtp->dyntick_holdoff == jiffies ? 'H' : '.',
  9     timer_pending(tltp) ? tltp->expires - jiffies : -1);
 10 }
 11 
 12 static void print_cpu_stall_info_begin(void)
 13 {
 14   printk(KERN_CONT "\n");
 15 }
 16 
 17 static void print_cpu_stall_info(struct rcu_state *rsp, int cpu)
 18 {
 19   char fast_no_hz[72];
 20   struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu);
 21   struct rcu_dynticks *rdtp = rdp->dynticks;
 22   char *ticks_title;
 23   unsigned long ticks_value;
 24 
 25   if (rsp->gpnum == rdp->gpnum) {
 26     ticks_title = "ticks this GP";
 27     ticks_value = rdp->ticks_this_gp;
 28   } else {
 29     ticks_title = "GPs behind";
 30     ticks_value = rsp->gpnum - rdp->gpnum;
 31   }
 32   print_cpu_stall_fast_no_hz(fast_no_hz, cpu);
 33   printk(KERN_ERR "\t%d: (%lu %s) idle=%03x/%llx/%d %s\n",
 34          cpu, ticks_value, ticks_title,
 35          atomic_read(&rdtp->dynticks) & 0xfff,
 36          rdtp->dynticks_nesting, rdtp->dynticks_nmi_nesting,
 37          fast_no_hz);
 38 }
 39 
 40 static void print_cpu_stall_info_end(void)
 41 {
 42   printk(KERN_ERR "\t");
 43 }
 44 
 45 static void zero_cpu_stall_ticks(struct rcu_data *rdp)
 46 {
 47   rdp->ticks_this_gp = 0;
 48 }
 49 
 50 static void increment_cpu_stall_ticks(void)
 51 {
 52   struct rcu_state *rsp;
 53 
 54   for_each_rcu_flavor(rsp)
 55     __this_cpu_ptr(rsp->rda)->ticks_this_gp++;
 56 }

The print_cpu_stall_fast_no_hz() function is shown on lines 1-10. It simply builds a string containing rcu_prepare_for_idle() state for diagnostic purposes. In kernels built with CONFIG_RCU_FAST_NO_HZ=n (in which rcu_prepare_for_idle() is an empty function), it instead builds an empty string.

The print_cpu_stall_info_begin() function is shown on lines 12-15. This function prints the opening bracket for the information printed by print_cpu_stall_info(). When print_cpu_stall_info() prints full lines, as it does in the CONFIG_RCU_CPU_STALL_INFO=y case shown here, that bracket is a newline; otherwise it is an open curly brace (“{”).

The print_cpu_stall_info() function is shown on lines 17-38 for CONFIG_RCU_CPU_STALL_INFO=y. Line 25 checks to see if the current CPU is aware of the current grace period, and if so lines 26 and 27 record the number of scheduling-clock ticks that this CPU has received during the current grace period. Otherwise, lines 29 and 30 record the number of grace periods that the current CPU has missed. Line 32 invokes print_cpu_stall_fast_no_hz() to pick up rcu_prepare_for_idle() state, and lines 33-37 print the information. In the CONFIG_RCU_CPU_STALL_INFO=n case, print_cpu_stall_info() instead simply prints the current CPU's ID.

Quick Quiz 3: Why wouldn't the current CPU be aware of the current grace period? After all, when printing an RCU CPU stall warning, the current grace period has extended for many seconds, perhaps even minutes!
Answer

The print_cpu_stall_info_end() function, shown on lines 40-43, is the counterpart of print_cpu_stall_info_begin(), and operates quite similarly, but with a closing curly brace (“}”) instead of an open curly brace in the CONFIG_RCU_CPU_STALL_INFO=n case.

The zero_cpu_stall_ticks() function is shown on lines 45-48, and zeros the count of scheduling-clock interrupts for the CPU specified by the rcu_data structure passed in. This function is called when the corresponding CPU notices that a new RCU grace period has started. The increment_cpu_stall_ticks() function is shown on lines 50-56, and increments each RCU flavor's count of scheduling-clock interrupts for the current CPU. Both of these functions are empty for CONFIG_RCU_CPU_STALL_INFO=n.
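
The interplay among these CONFIG_RCU_CPU_STALL_INFO=y helpers can be seen in the following user-space sketch, which models a single CPU's view of the grace-period number and its tick count, and reproduces the “ticks this GP” versus “GPs behind” decision from lines 25-31 of print_cpu_stall_info(). The structure and field names are simplified stand-ins rather than the kernel's rcu_state and rcu_data structures.

  #include <stdio.h>

  /* Simplified stand-ins for the kernel's rcu_state and rcu_data structures. */
  struct state { unsigned long gpnum; };                /* most recent GP number */
  struct data  { unsigned long gpnum, ticks_this_gp; }; /* one CPU's view */

  /*
   * Called when this CPU notices a new grace period; the zeroing of the
   * tick count corresponds to zero_cpu_stall_ticks().
   */
  static void notice_new_gp(struct data *rdp, struct state *rsp)
  {
    rdp->gpnum = rsp->gpnum;
    rdp->ticks_this_gp = 0;
  }

  /* One scheduling-clock tick, as in increment_cpu_stall_ticks() for one flavor. */
  static void tick(struct data *rdp)
  {
    rdp->ticks_this_gp++;
  }

  /* Model of the decision on lines 25-31 of print_cpu_stall_info(). */
  static void report(struct state *rsp, struct data *rdp)
  {
    if (rsp->gpnum == rdp->gpnum)
      printf("(%lu ticks this GP)\n", rdp->ticks_this_gp);
    else
      printf("(%lu GPs behind)\n", rsp->gpnum - rdp->gpnum);
  }

  int main(void)
  {
    struct state rsp = { .gpnum = 10 };
    struct data rdp = { .gpnum = 10, .ticks_this_gp = 5 };

    report(&rsp, &rdp);      /* "(5 ticks this GP)" */
    rsp.gpnum = 13;          /* three more GPs start without this CPU noticing */
    report(&rsp, &rdp);      /* "(3 GPs behind)" */
    notice_new_gp(&rdp, &rsp);
    tick(&rdp);
    report(&rsp, &rdp);      /* "(1 ticks this GP)" */
    return 0;
  }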

The rcu_print_detail_task_stall_rnp() and rcu_print_detail_task_stall() functions, shown below, print out RCU CPU stall warning information for the relevant rcu_node structures:

  1 static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp)
  2 {
  3   unsigned long flags;
  4   struct task_struct *t;
  5 
  6   raw_spin_lock_irqsave(&rnp->lock, flags);
  7   if (!rcu_preempt_blocked_readers_cgp(rnp)) {
  8     raw_spin_unlock_irqrestore(&rnp->lock, flags);
  9     return;
 10   }
 11   t = list_entry(rnp->gp_tasks,
 12            struct task_struct, rcu_node_entry);
 13   list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry)
 14     sched_show_task(t);
 15   raw_spin_unlock_irqrestore(&rnp->lock, flags);
 16 }
 17 
 18 static void rcu_print_detail_task_stall(struct rcu_state *rsp)
 19 {
 20   struct rcu_node *rnp = rcu_get_root(rsp);
 21 
 22   rcu_print_detail_task_stall_rnp(rnp);
 23   rcu_for_each_leaf_node(rsp, rnp)
 24     rcu_print_detail_task_stall_rnp(rnp);
 25 }

The rcu_print_detail_task_stall_rnp() function, shown on lines 1-16, prints out CPU-stall warning information for the specified rcu_node structure. Line 6 acquires the rcu_node structure's ->lock and line 15 releases it. Line 7 checks to see if there are any RCU readers queued on this structure that are blocking the current grace period, and if not, line 8 releases the ->lock and line 9 returns to the caller. Lines 11 and 12 obtain a pointer to the task referenced by this structure's ->gp_tasks pointer, and then line 13 iterates through the remainder of the ->blkd_tasks list, starting with the task referenced by ->gp_tasks. For each such task, line 14 dumps its stack.
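
The ->blkd_tasks list is ordered so that the tasks at or after the one referenced by ->gp_tasks are exactly the tasks blocking the current grace period, which is why starting the traversal at ->gp_tasks suffices. The following user-space sketch illustrates that convention with a plain singly linked list; it makes no attempt to model the kernel's list.h machinery or sched_show_task(), and the task names are made up.

  #include <stdio.h>
  #include <stddef.h>

  /* Toy stand-in for a preempted reader queued on an rcu_node structure. */
  struct blkd_task {
    const char *comm;          /* task name, standing in for sched_show_task() */
    struct blkd_task *next;    /* next-older task on ->blkd_tasks */
  };

  /*
   * Walk from the task referenced by gp_tasks to the tail of the list;
   * by the ordering convention, these are the tasks blocking the
   * current grace period.
   */
  static void show_gp_blockers(struct blkd_task *gp_tasks)
  {
    struct blkd_task *t;

    if (gp_tasks == NULL)      /* no readers blocking the current GP */
      return;
    for (t = gp_tasks; t != NULL; t = t->next)
      printf("stack dump for task %s\n", t->comm);
  }

  int main(void)
  {
    /* Newest task at the head, oldest at the tail. */
    struct blkd_task c = { "oldest-reader", NULL };
    struct blkd_task b = { "older-reader", &c };
    struct blkd_task a = { "newly-blocked-reader", &b };

    (void)a;                   /* head of the list; not blocking this GP */
    show_gp_blockers(&b);      /* gp_tasks points at the first GP blocker */
    return 0;
  }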

Quick Quiz 4: But what if the ->gp_tasks pointer is NULL on line 11 of rcu_print_detail_task_stall_rnp()? Won't that result in a segmentation fault?
Answer

The rcu_print_detail_task_stall() function is shown on lines 18-25. It simply invokes rcu_print_detail_task_stall_rnp() on the root rcu_node structure and on each leaf rcu_node structure.

Quick Quiz 5: Why doesn't rcu_print_detail_task_stall() also invoke rcu_print_detail_task_stall_rnp() on all rcu_node structures, rather than just the root and leaves?
Answer

The next function is print_other_cpu_stall(), which handles the case where one CPU detects that some other CPU has stalled.

  1 static void print_other_cpu_stall(struct rcu_state *rsp)
  2 {
  3   int cpu;
  4   long delta;
  5   unsigned long flags;
  6   int ndetected = 0;
  7   struct rcu_node *rnp = rcu_get_root(rsp);
  8 
  9   raw_spin_lock_irqsave(&rnp->lock, flags);
 10   delta = jiffies - rsp->jiffies_stall;
 11   if (delta < RCU_STALL_RAT_DELAY || !rcu_gp_in_progress(rsp)) {
 12     raw_spin_unlock_irqrestore(&rnp->lock, flags);
 13     return;
 14   }
 15   rsp->jiffies_stall = jiffies + 3 * jiffies_till_stall_check() + 3;
 16   raw_spin_unlock_irqrestore(&rnp->lock, flags);
 17   printk(KERN_ERR "INFO: %s detected stalls on CPUs/tasks:",
 18          rsp->name);
 19   print_cpu_stall_info_begin();
 20   rcu_for_each_leaf_node(rsp, rnp) {
 21     raw_spin_lock_irqsave(&rnp->lock, flags);
 22     ndetected += rcu_print_task_stall(rnp);
 23     if (rnp->qsmask == 0) {
 24       raw_spin_unlock_irqrestore(&rnp->lock, flags);
 25       continue;
 26     }
 27     for (cpu = 0; cpu <= rnp->grphi - rnp->grplo; cpu++)
 28       if (rnp->qsmask & (1UL << cpu)) {
 29         print_cpu_stall_info(rsp, rnp->grplo + cpu);
 30         ndetected++;
 31       }
 32     raw_spin_unlock_irqrestore(&rnp->lock, flags);
 33   }
 34   rnp = rcu_get_root(rsp);
 35   raw_spin_lock_irqsave(&rnp->lock, flags);
 36   ndetected += rcu_print_task_stall(rnp);
 37   raw_spin_unlock_irqrestore(&rnp->lock, flags);
 38   print_cpu_stall_info_end();
 39   printk(KERN_CONT "(detected by %d, t=%ld jiffies)\n",
 40          smp_processor_id(), (long)(jiffies - rsp->gp_start));
 41   if (ndetected == 0)
 42     printk(KERN_ERR "INFO: Stall ended before state dump start\n");
 43   else if (!trigger_all_cpu_backtrace())
 44     dump_stack();
 45   rcu_print_detail_task_stall(rsp);
 46   force_quiescent_state(rsp);
 47 }

Line 9 acquires the ->lock of the root rcu_node structure for the RCU flavor specified by the rsp argument; this lock is released on line 16. Line 10 computes the number of jiffies by which the scheduled stall-warning time has been exceeded, and line 11 checks to see whether this is large enough to warrant an RCU CPU stall warning and whether a grace period is still in progress; if not, lines 12 and 13 release the ->lock and return.

Otherwise, execution continues with line 15, which computes the time at which the next stall warning should occur, assuming that the current grace period does not end first. As noted earlier, line 16 releases the ->lock. Lines 17 and 18 print the stall-warning header and line 19 prints the opening bracket for the CPU/task list. Each pass through the loop spanning lines 20-33 prints CPU stall warnings for one of the leaf rcu_node structures. Line 21 acquires the current structure's ->lock and line 22 invokes rcu_print_task_stall() to print information on each preempted task on this structure that is blocking the current grace period (accumulating the number of such tasks in ndetected). If line 23 determines that there are no CPUs corresponding to this structure blocking the current grace period, line 24 releases the ->lock, and line 25 advances to the next leaf. Otherwise, the loop spanning lines 27-31 iterates through each CPU corresponding to this structure, with line 29 invoking print_cpu_stall_info() on each CPU that line 28 determines to be blocking the current grace period, and line 30 counting those CPUs. Finally, line 32 releases this structure's ->lock.
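
The CPU-scanning loop just described maps ->qsmask bit positions to CPU numbers using the leaf rcu_node structure's ->grplo and ->grphi fields. The following user-space sketch shows that mapping for a hypothetical leaf covering CPUs 16-31; the field names follow the kernel code above, but the structure and values are made up for illustration.

  #include <stdio.h>

  /* Toy stand-in for the relevant fields of a leaf rcu_node structure. */
  struct leaf {
    unsigned long qsmask;  /* bit N set: CPU grplo+N still owes a quiescent state */
    int grplo;             /* lowest-numbered CPU covered by this leaf */
    int grphi;             /* highest-numbered CPU covered by this leaf */
  };

  int main(void)
  {
    /* Hypothetical leaf covering CPUs 16-31, with CPUs 17 and 30 stalled. */
    struct leaf rnp = { .qsmask = (1UL << 1) | (1UL << 14),
                        .grplo = 16, .grphi = 31 };
    int cpu;

    for (cpu = 0; cpu <= rnp.grphi - rnp.grplo; cpu++)
      if (rnp.qsmask & (1UL << cpu))
        printf("CPU %d is blocking the current grace period\n",
               rnp.grplo + cpu);
    return 0;
  }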

Quick Quiz 6: Yikes!!! The print_other_cpu_stall() function holds rcu_node structures' ->lock spinlocks while called functions invoke printk(). Is that a recipe for horrendous lock contention or what???
Answer

Line 34 picks up a pointer to the root rcu_node structure and line 35 acquires its lock. Line 36 invokes rcu_print_task_stall() to print information on each preempted task on the root rcu_node structure that is blocking the current grace period. Line 37 then releases the lock and line 38 prints the closing bracket for the CPU/task list. Lines 39 and 40 print the stall-warning trailer. If line 41 sees that there actually were no tasks or CPUs blocking the current grace period, line 42 tells the sad story; otherwise, line 43 attempts to force all CPUs to dump their stacks, and if this attempt is unsuccessful, line 44 dumps the current CPU's stack. Line 45 dumps the stacks of all tasks blocking the current grace period (but only if CONFIG_RCU_CPU_STALL_VERBOSE=y) and line 46 forces quiescent states in an attempt to end the stall.

Quick Quiz 7: Why would the attempt to make other CPUs dump their stacks be subject to failure?
Answer

The following function, print_cpu_stall(), dumps out a stall-warning message when a CPU realizes that it is the one that is still blocking the current grace period.

  1 static void print_cpu_stall(struct rcu_state *rsp)
  2 {
  3   unsigned long flags;
  4   struct rcu_node *rnp = rcu_get_root(rsp);
  5 
  6   printk(KERN_ERR "INFO: %s self-detected stall on CPU", rsp->name);
  7   print_cpu_stall_info_begin();
  8   print_cpu_stall_info(rsp, smp_processor_id());
  9   print_cpu_stall_info_end();
 10   printk(KERN_CONT " (t=%lu jiffies)\n", jiffies - rsp->gp_start);
 11   if (!trigger_all_cpu_backtrace())
 12     dump_stack();
 13   raw_spin_lock_irqsave(&rnp->lock, flags);
 14   if (ULONG_CMP_GE(jiffies, rsp->jiffies_stall))
 15     rsp->jiffies_stall = jiffies +
 16              3 * jiffies_till_stall_check() + 3;
 17   raw_spin_unlock_irqrestore(&rnp->lock, flags);
 18   set_need_resched();
 19 }

Lines 6-10 dump the stall-warning header, opening bracket, per-CPU information (but only for the current CPU), closing bracket, and trailer, respectively. Line 11 attempts to trigger a backtrace on all CPUs, and if that fails, line 12 dumps the current CPU's stack. Line 13 acquires the root rcu_node structure's ->lock (line 17 releases it). Line 14 checks to see if the stall-warning time is in the past, and, if so, lines 15 and 16 compute the time of the next warning (assuming that the current grace period does not end beforehand). Finally, line 18 invokes set_need_resched() in a (probably futile) attempt to get this CPU unstalled.

Quick Quiz 8: Given that print_cpu_stall() is printing an RCU CPU stall warning in response to the stall warning time being in the past, how could it possibly be in the future?
Answer

The check_cpu_stall() function, shown below, is the top-level function for emitting RCU CPU stall warnings.

  1 static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
  2 {
  3   unsigned long j;
  4   unsigned long js;
  5   struct rcu_node *rnp;
  6 
  7   if (rcu_cpu_stall_suppress)
  8     return;
  9   j = ACCESS_ONCE(jiffies);
 10   js = ACCESS_ONCE(rsp->jiffies_stall);
 11   rnp = rdp->mynode;
 12   if (rcu_gp_in_progress(rsp) &&
 13       (ACCESS_ONCE(rnp->qsmask) & rdp->grpmask) && ULONG_CMP_GE(j, js)) {
 14     print_cpu_stall(rsp);
 15   } else if (rcu_gp_in_progress(rsp) &&
 16        ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY)) {
 17     print_other_cpu_stall(rsp);
 18   }
 19 }

Line 7 checks to see if CPU stall warnings have been suppressed, and if so line 8 returns to the caller. Lines 9 and 10 take snapshots of the current jiffies counter and of the jiffies value at which the current grace period's stall warning becomes due, and line 11 picks up a pointer to the current CPU's leaf rcu_node structure. If lines 12 and 13 determine that there is a grace period in progress, that this CPU is blocking the current grace period, and that it is time for a stall warning, line 14 invokes print_cpu_stall() to issue that warning. Otherwise, if lines 15 and 16 determine that there is a grace period in progress and that the stall-warning time passed a few jiffies ago, line 17 invokes print_other_cpu_stall() to print a stall warning on behalf of some other CPU or task.
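
A detail worth noting is that the time comparisons on lines 13 and 16 use ULONG_CMP_GE() rather than a plain >=, so that they remain correct when the jiffies counter wraps around. The following sketch uses the macro's definition from include/linux/rcupdate.h to show the wrapped case, and also why setting ->jiffies_stall to jiffies + ULONG_MAX / 2, as rcu_cpu_stall_reset() does later in this article, defers warnings for roughly half the counter's range; the specific jiffies values are made up for illustration.

  #include <stdio.h>
  #include <limits.h>

  /* Definition as in include/linux/rcupdate.h. */
  #define ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b))

  int main(void)
  {
    unsigned long js = ULONG_MAX - 5;   /* stall-warning time, just before wrap */
    unsigned long j = 10;               /* current jiffies, just after wrap */

    /* A plain >= gets the wrapped case wrong; ULONG_CMP_GE() does not. */
    printf("plain >=      : %d\n", j >= js);             /* prints 0 */
    printf("ULONG_CMP_GE(): %d\n", ULONG_CMP_GE(j, js)); /* prints 1 */

    /* rcu_cpu_stall_reset() pushes ->jiffies_stall half the counter ahead. */
    js = j + ULONG_MAX / 2;
    printf("after reset   : %d\n", ULONG_CMP_GE(j, js)); /* prints 0: no warning */
    return 0;
  }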

Quick Quiz 9: Why doesn't line 12 of check_cpu_stall() really need its rcu_gp_in_progress() check?
Answer

Finally, the following functions handle automatic suppression of stall warnings when other error conditions occur. This suppression is carried out to avoid corrupting long console messages with irrelevant stall-warning messages—the stall might well be a side-effect of the other error condition.

  1 static int rcu_panic(struct notifier_block *this, unsigned long ev, void *ptr)
  2 {
  3   rcu_cpu_stall_suppress = 1;
  4   return NOTIFY_DONE;
  5 }
  6 
  7 void rcu_cpu_stall_reset(void)
  8 {
  9   struct rcu_state *rsp;
 10 
 11   for_each_rcu_flavor(rsp)
 12     rsp->jiffies_stall = jiffies + ULONG_MAX / 2;
 13 }
 14 
 15 static struct notifier_block rcu_panic_block = {
 16   .notifier_call = rcu_panic,
 17 };
 18 
 19 static void __init check_cpu_stall_init(void)
 20 {
 21   atomic_notifier_chain_register(&panic_notifier_list, &rcu_panic_block);
 22 }

Lines 1-5 show the rcu_panic() function, which is a notifier function that suppresses stall warnings when a kernel panic occurs. Lines 7-13 show the rcu_cpu_stall_reset() function, which (nearly) indefinitely delays further stall warnings for the current grace period. This function is invoked in situations where the watchdog timer is also suppressed. Lines 15-17 define the notifier block for rcu_panic(), and lines 19-22 show the check_cpu_stall_init() function that registers this notifier. Once the notifier is registered, any subsequent kernel panic will invoke rcu_panic(), thereby shutting off stall warnings.

Summary

This article has described RCU's CPU stall warning mechanism, which can help locate certain types of bugs. For more information on the tuning and use of this mechanism, please see Documentation/RCU/stallwarn.txt.

Acknowledgments

I am grateful to @@@ for their help in rendering this article human readable.

Legal Statement

This work represents the view of the author and does not necessarily represent the view of IBM.

Linux is a registered trademark of Linus Torvalds.

Other company, product, and service names may be trademarks or service marks of others.

Answers to Quick Quizzes

Quick Quiz 1: How could a hardware failure result in an RCU CPU stall warning?

Answer: If a CPU fails in such a way that it stops executing, but without disturbing the rest of the system, then that CPU will never again report quiescent states to RCU. This is not a particularly probable failure mode, but it really has happened in real life. The RCU CPU stall warning messages were quite helpful in identifying the failed CPU.

Back to Quick Quiz 1.

Quick Quiz 2: Why not instead invoke RCU CPU stall detection from the grace-period-detection kthread?

Answer: In the future, RCU CPU stall detection might also be invoked from the quiescent-state-forcing mechanism that is driven by the grace-period-detection kthread. However, this is unlikely to replace the other points of invocation, especially the scheduling-clock interrupt. The reason is that the scheduling-clock interrupt will continue to run in kernels that are otherwise in very bad shape, so removing RCU CPU stall detection from the scheduling-clock interrupt handler would remove all diagnostics for certain types of hangs.

Back to Quick Quiz 2.

Quick Quiz 3: Why wouldn't the current CPU be aware of the current grace period? After all, when printing an RCU CPU stall warning, the current grace period has extended for many seconds, perhaps even minutes!

Answer: Yes, normally the current CPU would be aware of the current grace period: this code is executed only if the current CPU is blocking the current grace period. However, it is worth noting that CPUs in dyntick-idle mode and CPUs that are offline will not normally be aware of the current grace period. So, the question would then be “Why didn't some other CPU report a quiescent state on their behalf?” That question should help direct your debugging efforts.

Back to Quick Quiz 3.

Quick Quiz 4: But what if the ->gp_tasks pointer is NULL on line 11 of rcu_print_detail_task_stall_rnp()? Won't that result in a segmentation fault?

Answer: If ->gp_tasks is NULL, then rcu_preempt_blocked_readers_cgp() would have returned false, so that control would never have reached line 11 in the first place.

Back to Quick Quiz 4.

Quick Quiz 5: Why doesn't rcu_print_detail_task_stall() also invoke rcu_print_detail_task_stall_rnp() on all rcu_node structures, rather than just the root and leaves?

Answer: Only the root and leaves can queue tasks that have been preempted within their RCU read-side critical sections. So there is no reason to invoke rcu_print_detail_task_stall_rnp() on anything other than the root and leaves.

Back to Quick Quiz 5.

Quick Quiz 6: Yikes!!! The print_other_cpu_stall() function holds rcu_node structures' ->lock spinlocks while called functions invoke printk(). Is that a recipe for horrendous lock contention or what???

Answer: It would be a recipe for horrendous lock contention if print_other_cpu_stall() were invoked frequently. However, the maximum rate at which it can currently be invoked is once per three seconds, so it should not be a problem. Unless you are dumping your console output over a low-speed serial line, in which case you just might want to speed up your serial console anyway.

Back to Quick Quiz 6.

Quick Quiz 7: Why would the attempt to make other CPUs dump their stacks be subject to failure?

Answer: Because a number of architectures don't implement it.

Back to Quick Quiz 7.

Quick Quiz 8: Given that print_cpu_stall() is printing an RCU CPU stall warning in response to the stall warning time being in the past, how could it possibly be in the future?

Answer: Because some other CPU might have printed a stall-warning message concurrently, which would have caused it to update the stall warning time before we got a chance to check it.

Back to Quick Quiz 8.

Quick Quiz 9: Why doesn't line 12 of check_cpu_stall() really need its rcu_gp_in_progress() check?

Answer: Because the check is implicit in the check of ->qsmask: If there was no grace period, there could not possibly be any bits set in that mask.

Back to Quick Quiz 9.