Courses/Computer Science/CPSC 457.F2014/Lecture Notes/KernSync
In this session, we will continue our examination of kernel synchronization, particularly dealing with the topic of kernel preemption, how certain synchronization primitives are implemented, and a couple of advanced synchronization techniques. We will finish (time permitting) with some code for (1) timing and (2) drawing the use of locks.
- How are spinlocks and semaphores actually implemented inside the kernel?
- How can we improve system throughput by distinguishing between readers and writers?
- How can we improve system throughput by allowing a writer into a critical section even though readers may also be present?
- How does RCU safely provide synchronization without using locks?
The slides for today PDF.
Concurrency Design Considerations
- see ULK, pg 217
- A good OS design goal is to maximize the amount of concurrency in the system. This can be done by:
- enabling most I/O devices to operate concurrently
- CPUs should be kept as much as possible in a state where they help processes make progress (as opposed to, e.g., busy-waiting)
- this means that
- (1) interrupt handlers should be fast and (2) most kernel paths should disable interrupts for a very short period of time
- spin locks should be avoided
Waking Up Tasks from a RW Semaphore Queue The __rwsem_do_wake() function illustrates how the semaphore up() (i.e., exit/release) functionality will attempt to pick a waiting writer or let a number of waiting readers into the critical section.
Read-Copy-Update (RCU) The read-copy-update mechanism is on display even in as simple a facility as servicing the getppid(2) system call. When calling the task_tgid_vnr() function, getppid() has to be careful because current->real_parent might change during the call, so it invokes rcu_read_lock(). This function in turn invokes __rcu_read_lock(), which maps via a macro to the call to disable kernel preemption, preempt_disable()
Timing the Performance Difference in Locked vs. Unlocked Code
- Code that protects a critical section containing a for loop that increments a global variable. http://www.cpsc.ucalgary.ca/~locasto/teaching/2012/CPSC457/srace.c
- The same code without locks: http://www.cpsc.ucalgary.ca/~locasto/teaching/2012/CPSC457/urace.c
- The results:
[michael@xorenduex lockperf]$ time ./saferace real 0m6.166s user 0m6.149s sys 0m0.007s [michael@xorenduex lockperf]$ time ./unrace real 0m0.466s user 0m0.461s sys 0m0.003s [michael@xorenduex lockperf]$
LKD: Chapter 10: Kernel Synchronization Methods
(not mandatory, FYI if you have access to the ULK book)
ULK: Chapter 5 "Kernel Synchronization" (most of today's material is drawn from this chapter). As a practical illustration of abstract CS concepts, this is probably one of the best chapters in ULK.