Courses/Computer Science/CPSC 457.W2012/Lecture Notes/KernSync

= Kernel Synchronization =

In this session, we will continue our examination of kernel synchronization, particularly dealing with the topic of kernel preemption, how certain synchronization primitives are implemented, and a couple of advanced synchronization techniques. We will finish (time permitting) with some code for (1) timing and (2) drawing the use of locks.

Focus questions:
 * How are spinlocks and semaphores actually implemented inside the kernel?
 * How can we improve system throughput by distinguishing between readers and writers?
 * How can we improve system throughput by allowing a writer into a critical section even though readers may also be present?
 * How does RCU safely provide synchronization without using locks?

= Notes =

The slides for today in both PPTX and PDF.

Concurrency Design Considerations
 * see ULK, pg 217
 * A good OS design goal is to maximize the amount of concurrency in the system. This can be done by
 * enabling most I/O devices to operate concurrently
 * CPUs should be kept as much as possible in a state where they help processes make progress (as opposed to, e.g., busy-waiting)
 * this means that
 * (1) interrupt handlers should be fast and (2) most kernel paths should disable interrupts for a very short period of time
 * spin locks should be avoided

Waking Up Tasks from a RW Semaphore Queue The __rwsem_do_wake function illustrates how the semaphore up (i.e., exit/release) functionality will attempt to pick a waiting writer or let a number of waiting readers into the critical section.

Read-Copy-Update (RCU) The read-copy-update mechanism is on display even in as simple a facility as servicing the getppid(2) system call. When calling the task_tgid_vnr function, getppid has to be careful because current->real_parent might change during the call, so it invokes rcu_read_lock. This function in turn invokes __rcu_read_lock, which maps via a macro to the call to disable kernel preemption, preempt_disable

Timing the Performance Difference in Locked vs. Unlocked Code
 * Code that protects a critical section containing a for loop that increments a global variable. http://www.cpsc.ucalgary.ca/~locasto/teaching/2012/CPSC457/srace.c
 * The same code without locks: http://www.cpsc.ucalgary.ca/~locasto/teaching/2012/CPSC457/urace.c
 * The results:

[michael@xorenduex lockperf]$ time ./saferace real	0m6.166s user	0m6.149s sys	0m0.007s [michael@xorenduex lockperf]$ time ./unrace real	0m0.466s user	0m0.461s sys	0m0.003s [michael@xorenduex lockperf]$

= Reading =

None, work on HW3.

= Supplemental Reading =

ULK: Chapter 5 "Kernel Synchronization" (most of today's material is drawn from this chapter). As a practical illustration of abstract CS concepts, this is probably one of the best chapters in ULK.