Courses/Computer Science/CPSC 457.F2013/Lecture Notes/Timing


Process Scheduling and Timing

This session is partly a continuation of the previous session.

In this session, we will finish our treatment of scheduling by looking at the Linux 2.6 and 2.4 scheduler code, paying special attention to some helper routines (such as those that calculate priority or perform the context switch).

  • scheduler_tick (updating time slice of a "SCHED_NORMAL" process)
  • schedule() (actually perform the selection of the next process)
  • goodness() (in 2.4)
  • recalc_task_prio
  • context_switch, switch_to

Along the way, we will more fully consider some of the support available for timing and the quantum or slice (this is related to our Measurement theme). Timing is an important topic in operating systems because a reliable signal source is needed to synchronize and kickstart various procedures. We will consider the hardware support necessary for this functionality to exist.

Focus Question: How can we understand what the conditions are for the scheduler to execute? This is a subtopic of kernel control flow (which isn't like the traditional sequential control flow of many userland programs).


  • finish overview of schedule() routines
  • overview of scheduling helper routines
  • Clock Hardware
  • Clock Software
  • Kernel support for time management


We started the class by highlighting two major principles or themes:

  • Policy vs. Mechanism (an expression of configuration vs. functionality)
  • The idea of kernel control flow as many asynchronous control paths

We then reviewed the main code of both the 2.4 and the 2.6 scheduler.
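To give a feel for the 2.4 approach, here is a minimal sketch of the shape of a goodness()-style calculation. It is a simplified toy, not the kernel source: the struct, the field names, and the `toy_` identifiers are assumptions made for illustration, and the SMP affinity bonus is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the 2.4 policy constants. */
enum toy_policy { TOY_SCHED_OTHER, TOY_SCHED_RT };

/* Toy task descriptor: only the fields this sketch needs. */
struct toy_task24 {
    enum toy_policy policy;
    int counter;      /* ticks left in the current epoch */
    int nice;         /* -20 (most favoured) .. 19 (least favoured) */
    int rt_priority;  /* only meaningful for real-time tasks */
    int mm_id;        /* stands in for the address-space pointer */
};

/* Higher return value means "better candidate to run next". */
int toy_goodness(const struct toy_task24 *p, int current_mm_id)
{
    assert(p != NULL);

    /* Real-time tasks always outrank ordinary ones. */
    if (p->policy == TOY_SCHED_RT)
        return 1000 + p->rt_priority;

    /* A task that has used up its quantum is not worth picking. */
    if (p->counter == 0)
        return 0;

    int weight = p->counter;
    if (p->mm_id == current_mm_id)
        weight += 1;              /* small bonus: no address-space switch */
    weight += 20 - p->nice;       /* lower nice value, higher weight */
    return weight;
}
```

The 2.4 schedule() loop evaluates a value like this for every runnable task, which is why its cost grows with the number of runnable processes.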

The 2.6 scheduling path begins with a call to schedule(). This function is invoked from a number of places in the kernel whenever a scheduling decision has to be made. Typically this happens after checking a flag (TIF_NEED_RESCHED) that indicates the kernel needs to reschedule, for example because process priorities have changed enough to warrant it or because the current process has used up its time slice. The flag is usually checked when returning from a system call or from servicing an interrupt: we (the kernel) are already executing, so let's take advantage and determine if we should let another process run.
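The "set a flag now, act on it at the next convenient exit point" pattern can be sketched in a few lines of user-space C. This is a toy model, not kernel code; every `toy_` name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy per-CPU state. */
struct toy_cpu {
    bool need_resched;   /* stands in for the kernel's reschedule flag */
    int  schedule_calls; /* how often the toy scheduler actually ran */
};

/* Stand-in for schedule(): clears the request and counts the invocation. */
void toy_schedule(struct toy_cpu *cpu)
{
    assert(cpu != NULL);
    cpu->need_resched = false;
    cpu->schedule_calls++;
}

/* A timer tick or wakeup only *requests* a reschedule; it does not switch. */
void toy_set_need_resched(struct toy_cpu *cpu)
{
    cpu->need_resched = true;
}

/* Common exit path from a system call or interrupt handler. */
void toy_return_to_user(struct toy_cpu *cpu)
{
    if (cpu->need_resched)
        toy_schedule(cpu);
    /* ...then resume whichever task is now current... */
}
```

Several requests made before the next exit point collapse into a single scheduler invocation, and an exit with no pending request skips the scheduler entirely.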

The main "work" is to select the next task via a call (at line 5453) to the routine pick_next_task; this fairly simple function picks the highest priority task on the system from the runqueue priority buckets. The execution time of this function is bounded by a constant; it does not depend on how many processes are actually runnable or extant on the system. In the case of the CFS, pick_next_task delegates to the pick_next_task_fair routine.
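One way to see why picking from priority buckets can take constant time is to keep a bitmap with one bit per bucket and ask the hardware for the first set bit. The sketch below is a toy with 64 levels (the 2.6 O(1) scheduler used 140) and it only counts tasks per bucket instead of keeping real lists; all `toy_` names are hypothetical, and `__builtin_ctzll` is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NPRIO 64  /* toy value so one 64-bit word covers every level */

struct toy_runqueue {
    uint64_t bitmap;            /* bit p is set when bucket p is non-empty */
    int      count[TOY_NPRIO];  /* runnable tasks in each bucket */
};

void toy_enqueue(struct toy_runqueue *rq, int prio)
{
    assert(prio >= 0 && prio < TOY_NPRIO);
    rq->count[prio]++;
    rq->bitmap |= (uint64_t)1 << prio;
}

void toy_dequeue(struct toy_runqueue *rq, int prio)
{
    assert(prio >= 0 && prio < TOY_NPRIO && rq->count[prio] > 0);
    if (--rq->count[prio] == 0)
        rq->bitmap &= ~((uint64_t)1 << prio);
}

/* Lowest-numbered (i.e. highest-priority) non-empty bucket, or -1 if idle.
 * The cost is one test and one bit scan, however many tasks are queued. */
int toy_pick_next_prio(const struct toy_runqueue *rq)
{
    if (rq->bitmap == 0)
        return -1;
    return __builtin_ctzll(rq->bitmap);
}
```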

Once the next task is selected, the scheduler has to initiate the actual context switch to that process by calling the context_switch function. This function relies on the switch_to assembly macro to save the outgoing process's CPU context and to restore the context of the incoming process (which was saved when that process was last taken off the CPU).
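The kernel's switch_to cannot be run from user space, but the same save-and-restore idea is visible in the ucontext API that glibc provides. The sketch below is an analogy only, assuming a glibc system; the function and variable names are hypothetical. Each swapcontext() call saves the caller's registers and stack pointer and resumes the other context exactly where it last stopped.

```c
#include <assert.h>
#include <stddef.h>
#include <ucontext.h>

static ucontext_t main_ctx, task_ctx;   /* two saved CPU contexts */
static char task_stack[64 * 1024];      /* private stack for the second context */
static int *trace;                      /* caller-supplied log of execution order */
static int trace_len;

static void record(int step) { trace[trace_len++] = step; }

static void task_body(void)
{
    record(2);
    swapcontext(&task_ctx, &main_ctx);  /* save ourselves, resume main */
    record(4);                          /* we continue here when resumed */
    /* returning from this function resumes uc_link (main_ctx) */
}

/* Fills order[0..3] with the sequence of steps; returns how many ran. */
int run_switch_demo(int order[4])
{
    assert(order != NULL);
    trace = order;
    trace_len = 0;

    int rc = getcontext(&task_ctx);
    assert(rc == 0);
    (void)rc;
    task_ctx.uc_stack.ss_sp = task_stack;
    task_ctx.uc_stack.ss_size = sizeof task_stack;
    task_ctx.uc_link = &main_ctx;
    makecontext(&task_ctx, task_body, 0);

    record(1);
    swapcontext(&main_ctx, &task_ctx);  /* run the task until it switches back */
    record(3);
    swapcontext(&main_ctx, &task_ctx);  /* resume the task where it left off */
    return trace_len;
}
```

The point to notice is that the second switch does not restart task_body; it picks up right after the task's own swapcontext() call, just as a descheduled process resumes in the middle of whatever it was doing.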

Voluntarily giving up the CPU

There was a question on how to voluntarily give up the CPU; a process can do this by invoking the sched_yield(2) system call. Note that its kernel implementation (sys_sched_yield) invokes the schedule() function to select another process, although it is still possible that the yielding process will be selected next.
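From user space the call looks like this. sched_yield() is the standard POSIX interface declared in <sched.h>; the wrapper function name is just for illustration.

```c
#include <assert.h>
#include <sched.h>

/* Offer the CPU to other runnable tasks n times.
 * Returns how many of the calls reported success (0 from sched_yield). */
int yield_cpu(int n)
{
    assert(n >= 0);
    int ok = 0;
    for (int i = 0; i < n; i++) {
        if (sched_yield() == 0)
            ok++;
    }
    return ok;
}
```

A successful return says nothing about whether another process actually ran in between; if the caller is the only runnable task, it is simply picked again.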

Timing and Scheduling


The sched_info data structure keeps track of how much CPU time each process has received. The sched_entity data structure also keeps timing statistics; a field of this type is present in the task_struct (struct sched_entity se). The sched_info data structure is conditionally compiled into the task_struct; see lines 1079 to 1081.
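User programs can observe the result of this kind of per-process accounting through the POSIX CPU-time clock. The sketch below assumes a POSIX system with clock_gettime(); the function names are hypothetical, and the notes do not establish which kernel fields feed this clock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* CPU time charged to the calling process so far, in nanoseconds. */
uint64_t cpu_time_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts);
    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Burn some CPU and report how much CPU time the loop was charged. */
uint64_t cpu_cost_of_spin(unsigned long iterations)
{
    volatile unsigned long sink = 0;  /* volatile keeps the loop from being removed */
    uint64_t before = cpu_time_ns();
    for (unsigned long i = 0; i < iterations; i++)
        sink += i;
    return cpu_time_ns() - before;
}
```

Unlike wall-clock time, this value does not advance while the process is waiting on the runqueue or sleeping, which makes it a direct view of what the scheduler has granted.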

The sched_class data structure represents a collection of attributes and function pointers for accomplishing the different scheduler-related activities for different classes (types) of processes under a particular scheduling approach (e.g., normal, realtime).
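The "struct of function pointers" idea can be sketched as follows. This is a toy with a single operation and two classes chained from highest to lowest precedence; the struct layout, the priority ranges, and all `toy_` names are assumptions for illustration and much smaller than the real sched_class.

```c
#include <assert.h>
#include <stddef.h>

struct toy_task { int prio; int runnable; };

/* One "class" bundles a name with the operations that implement its policy. */
struct toy_sched_class {
    const char *name;
    const struct toy_sched_class *next;  /* next class in precedence order */
    struct toy_task *(*pick_next_task)(struct toy_task *tasks, size_t n);
};

/* Shared helper: best (lowest prio number) runnable task with lo <= prio < hi. */
static struct toy_task *pick_in_range(struct toy_task *t, size_t n, int lo, int hi)
{
    struct toy_task *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (t[i].runnable && t[i].prio >= lo && t[i].prio < hi &&
            (best == NULL || t[i].prio < best->prio))
            best = &t[i];
    }
    return best;
}

struct toy_task *pick_rt(struct toy_task *t, size_t n)   { return pick_in_range(t, n, 0, 100); }
struct toy_task *pick_fair(struct toy_task *t, size_t n) { return pick_in_range(t, n, 100, 140); }

const struct toy_sched_class toy_fair_class = { "fair", NULL, pick_fair };
const struct toy_sched_class toy_rt_class   = { "rt", &toy_fair_class, pick_rt };

/* Core code stays policy-agnostic: ask each class in turn until one has work. */
struct toy_task *toy_pick_next_task(struct toy_task *t, size_t n, const char **class_name)
{
    assert(class_name != NULL);
    for (const struct toy_sched_class *c = &toy_rt_class; c != NULL; c = c->next) {
        struct toy_task *p = c->pick_next_task(t, n);
        if (p != NULL) {
            *class_name = c->name;
            return p;
        }
    }
    return NULL;  /* nothing runnable in any class */
}
```

This is the policy vs. mechanism split in miniature: the loop in toy_pick_next_task is mechanism, and each class supplies its own policy behind the same function-pointer interface.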

On the topic of timing, the sched_clock_tick function updates the "current" time through a call chain: it calls ktime_get, which calls ktime_get_ts, which eventually resolves (on x86) to executing the rdtsc assembly instruction via the __native_read_tsc inline function.

Notice how we cross from the generic "get kernel time" notion to the machine-specific functionality surrounding the hardware clock.
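The same generic-to-machine-specific crossing can be imitated in user space. The sketch below assumes GCC or Clang inline assembly; on x86 it executes rdtsc directly, and on other architectures it falls back to a POSIX monotonic clock. The function name is hypothetical, and the two branches return values in different units (cycles vs. nanoseconds), so only differences taken on the same machine are meaningful.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Generic interface: "give me a counter that only moves forward". */
uint64_t read_cycle_counter(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* Machine-specific path: rdtsc returns the 64-bit time stamp
     * counter split across the EDX (high) and EAX (low) registers. */
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    /* Portable fallback for non-x86 machines: nanoseconds, not cycles. */
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}
```

Callers only see read_cycle_counter(); the preprocessor decides which hardware facility sits underneath, much as the kernel's generic time functions bottom out in architecture-specific code.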

Advanced Topics

We touched on several other pieces of functionality:

  • real time scheduling (as faked in Linux)
  • sys_nice
  • sched_yield (voluntarily relinquish CPU)
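As an illustration of the sys_nice entry point, the sketch below lowers the calling process's own priority using the standard nice() and getpriority() calls. The wrapper names are hypothetical. Raising the nice value needs no privilege; lowering it normally does.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <sys/resource.h>
#include <unistd.h>

/* Current nice value of the calling process (-20 .. 19).
 * getpriority() can legitimately return -1, so errno must be checked. */
int current_nice(void)
{
    errno = 0;
    int value = getpriority(PRIO_PROCESS, 0);
    assert(!(value == -1 && errno != 0));
    return value;
}

/* Add a non-negative delta to our nice value and report the result.
 * The kernel clamps the value at 19. */
int be_nicer(int delta)
{
    assert(delta >= 0);
    errno = 0;
    int rc = nice(delta);   /* may return -1 legitimately; not used for the result */
    (void)rc;
    return current_nice();
}
```

Calling be_nicer(1) asks the scheduler to favour other processes slightly more; the change is inherited by any children created afterwards.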

We will not cover more advanced topics like:

  • balancing run queues across processors or cores
  • the full set of scheduling-related system calls
    • getpriority() and setpriority()
    • sched_getaffinity(), sched_setaffinity()
    • sched_getscheduler(), sched_setscheduler()
    • sched_getparam(), sched_setparam()
    • sched_get_priority_min(), sched_get_priority_max()
    • sched_rr_get_interval()

Scribe Notes


Reference material for today's session about timing, timers, and clocks.

  • MOS: 5.5 "Clocks"