Linux CPU Scheduling


Today we will discuss CPU scheduling in Linux. CPU scheduling means sharing the CPU among different processes. Earlier, most systems were uniprocessor, and the CPU scheduling algorithms used on them were:

  1. FCFS (First Come, First Served)
  2. SJF (Shortest Job First)
  3. RR (Round Robin)
  4. Priority-Based Scheduling

Nowadays systems are multiprocessor, so the scheduler must distribute work so that all processors execute jobs in a balanced way, and newer scheduling algorithms are designed for this. In addition, most of the newer scheduling algorithms are priority based.

Now, when we consider a priority-based algorithm, the first question that arises is: who assigns the priority? We cannot leave this entirely to the user, since user processes could then end up with a higher priority than system processes, which is not correct because interrupt handlers and kernel threads should have higher priority. We also cannot leave the decision entirely to the system, since the user would then have no way to tell the system how important a task is. Linux therefore has two types of priorities:

  • Static priority: assigned by the user through the nice() system call.
  • Dynamic priority: the priority assigned earlier, recalculated by the scheduler as the process runs.

In Linux, processes are classified as:

  • Real-time (RT) processes, whose priorities the scheduler does not adjust
  • Normal processes, which are either CPU bound (batch) or IO bound (interactive)

The scheduler discussed here is known as the O(1) scheduler (it shipped with the 2.6 kernels until the Completely Fair Scheduler replaced it in 2.6.23). It gets its name because all of its operations take constant time, e.g. selecting the process with the highest priority, recalculating priorities, and adding a process to a queue. It is multi-level, priority based, fair, and preemptive: once a process's time quantum is over, it is switched out for a process of higher or equal priority. The default time quantum is about 100 ms. There are 140 priority levels (0 to 99 for real-time processes and 100 to 139 for normal processes), and each level has its own queue of processes. To find the highest-priority process, the scheduler keeps a bitmap with one bit per priority level and scans it from the lowest-numbered bit; the first bit found to be set identifies the priority level from which the next process is taken.
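
The bitmap lookup can be sketched in a few lines of C. This is only an illustration, not the kernel's code: the struct and function names are made up for this post, and __builtin_ctzl is a GCC/Clang builtin (count trailing zeros) that we assume is available. Because the bitmap has a fixed number of words, the scan takes the same bounded time no matter how many processes are runnable.

```c
#include <limits.h>
#include <stddef.h>

#define MAX_PRIO      140   /* priority levels 0..139 */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS  ((MAX_PRIO + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical stand-in for the priority bitmap: one bit per level. */
struct prio_bitmap {
    unsigned long word[BITMAP_WORDS];
};

/* Mark level prio (0..MAX_PRIO-1) as having at least one runnable process. */
static void bitmap_set(struct prio_bitmap *b, int prio)
{
    b->word[prio / BITS_PER_WORD] |= 1UL << (prio % BITS_PER_WORD);
}

/* Mark level prio as empty. */
static void bitmap_clear(struct prio_bitmap *b, int prio)
{
    b->word[prio / BITS_PER_WORD] &= ~(1UL << (prio % BITS_PER_WORD));
}

/*
 * Return the lowest-numbered set bit, i.e. the highest priority level
 * that has a runnable process, or MAX_PRIO if no bit is set.
 * The loop runs over a fixed number of words, so the cost is constant.
 */
static int bitmap_first_set(const struct prio_bitmap *b)
{
    for (size_t i = 0; i < BITMAP_WORDS; i++) {
        if (b->word[i] != 0)
            return (int)(i * BITS_PER_WORD) + __builtin_ctzl(b->word[i]);
    }
    return MAX_PRIO;
}
```

With levels 5, 120 and 139 marked as non-empty, bitmap_first_set() returns 5; after clearing 5 it returns 120.
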
Linux maintains two such priority arrays (each an array of 140 priority queues, as described above), known as active and expired. A process whose time quantum is over is moved to the expired array. Once the active array has no processes left, the two pointers are exchanged: the expired array becomes active and the active array becomes expired, so every process runs again with a new time quantum. One such round is also called an epoch.
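
A rough sketch of the active/expired exchange follows. It is a simplification under our own assumptions: the arrays only count processes per level instead of holding real queues, and all names (struct runqueue, rq_expire and so on) are chosen for illustration rather than copied from the kernel. The point to notice is that the exchange is just a pointer swap, so it does not depend on the number of processes.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_PRIO 140

/* Simplified priority array: counts processes per level, no real queues. */
struct prio_array {
    int nr_active;         /* total processes in this array */
    int count[MAX_PRIO];   /* processes waiting at each priority level */
};

struct runqueue {
    struct prio_array arrays[2];
    struct prio_array *active;    /* processes that still have quantum left */
    struct prio_array *expired;   /* processes that used up their quantum */
};

static void rq_init(struct runqueue *rq)
{
    memset(rq->arrays, 0, sizeof rq->arrays);
    rq->active = &rq->arrays[0];
    rq->expired = &rq->arrays[1];
}

/* Add a runnable process at level prio to the active array. */
static void rq_enqueue(struct runqueue *rq, int prio)
{
    rq->active->count[prio]++;
    rq->active->nr_active++;
}

/*
 * A process at level prio has used up its quantum: move it to the
 * expired array. If the active array is now empty, exchange the two
 * pointers. Returns true when the exchange happened.
 */
static bool rq_expire(struct runqueue *rq, int prio)
{
    rq->active->count[prio]--;
    rq->active->nr_active--;
    rq->expired->count[prio]++;
    rq->expired->nr_active++;

    if (rq->active->nr_active == 0) {
        struct prio_array *tmp = rq->active;
        rq->active = rq->expired;
        rq->expired = tmp;
        return true;
    }
    return false;
}
```

With two processes queued, the first call to rq_expire() returns false and the second returns true, after which the active array again holds both processes.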

While recalculating the priority, the scheduler gives each process a bonus that varies from 0 to 10. One thing to remember is that a higher priority value means a lower priority in scheduler terms. Hence, to raise a process's priority the scheduler subtracts the bonus from the current priority value, and to lower the priority it adds to the value. To make the system fairer, IO bound processes get a larger bonus than CPU bound ones; the bonus depends on how much time the process has spent sleeping. The priorities of RT processes stay the same.

(For details of the data structures and methods, check the files kernel/sched.c and kernel/sched.h in the kernel source, and man sched_setscheduler.)

The algorithm for recalculating the priority of a process is:

bonus = CURRENT_BONUS(p) - MAX_BONUS / 2;
prio = p->static_prio - bonus;

CURRENT_BONUS is a macro defined in kernel/sched.c.

Essentially, CURRENT_BONUS maps a task’s sleep average onto the range 0 to MAX_BONUS, which is 0 to 10. If a task has a high sleep_avg, the value returned by CURRENT_BONUS will be high, and vice versa. Since MAX_BONUS is twice as large as the amount by which a task’s priority is allowed to rise or fall (a MAX_BONUS of 10 means that the priority adjustment can range from +5 to -5), MAX_BONUS is divided by two and that value is subtracted from CURRENT_BONUS(p).
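
To make the arithmetic concrete, here is a small model of the calculation. It is a sketch under stated assumptions, not the kernel source: struct task is a cut-down stand-in for the real task structure, sleep_avg is kept in milliseconds, and the MAX_SLEEP_AVG cap of 1000 ms is our assumption for the example. The clamping to the range 100 to 139 and the early return for RT processes reflect the behaviour described above (RT priorities stay the same).

```c
#define MAX_RT_PRIO   100    /* levels 0..99 are real-time */
#define MAX_PRIO      140    /* levels 100..139 are for normal processes */
#define MAX_BONUS     10
#define MAX_SLEEP_AVG 1000   /* assumed cap on sleep_avg, in ms */

/* Cut-down, hypothetical stand-in for the kernel's task structure. */
struct task {
    int static_prio;   /* 100..139, derived from the nice value */
    int sleep_avg;     /* 0..MAX_SLEEP_AVG, in ms (simplified) */
    int rt;            /* non-zero for a real-time process */
    int prio;          /* current priority (used as-is for RT) */
};

/* Map sleep_avg onto the range 0..MAX_BONUS. */
static int current_bonus(const struct task *p)
{
    return p->sleep_avg * MAX_BONUS / MAX_SLEEP_AVG;
}

/* Dynamic priority: static priority adjusted by -5..+5. */
static int effective_prio(const struct task *p)
{
    int bonus, prio;

    if (p->rt)
        return p->prio;              /* RT priorities are never adjusted */

    bonus = current_bonus(p) - MAX_BONUS / 2;
    prio = p->static_prio - bonus;

    /* Keep the result inside the range for normal processes. */
    if (prio < MAX_RT_PRIO)
        prio = MAX_RT_PRIO;
    if (prio > MAX_PRIO - 1)
        prio = MAX_PRIO - 1;
    return prio;
}
```

For a process with static priority 120, a sleep_avg of 1000 ms gives a bonus of 10 - 5 = 5 and a dynamic priority of 115, while a sleep_avg of 0 gives a bonus of -5 and a dynamic priority of 125.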

Here, p->static_prio is the static priority of the process, derived from the nice value that the user sets with the nice() system call. The default nice value is 0, which corresponds to a static priority of 120.
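
The relation between the nice value and the static priority is simple arithmetic, sketched below. The two macros follow the form we remember from kernel/sched.c; the helper static_prio_for_nice() and its clamping of out-of-range values are our own addition for illustration.

```c
#define MAX_RT_PRIO 100   /* levels 0..99 are reserved for real-time */

/* nice -20..19 maps onto static priority 100..139. */
#define NICE_TO_PRIO(nice) (MAX_RT_PRIO + (nice) + 20)
#define PRIO_TO_NICE(prio) ((prio) - MAX_RT_PRIO - 20)

/* Hypothetical helper: clamp the nice value, then convert it. */
static int static_prio_for_nice(int nice)
{
    if (nice < -20)
        nice = -20;
    if (nice > 19)
        nice = 19;
    return NICE_TO_PRIO(nice);
}
```

So a nice value of 0 gives static priority 120, -20 gives 100 (the highest priority for a normal process), and 19 gives 139 (the lowest).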

So far we have discussed priority assignment, priority recalculation, and the data structures and algorithm behind them. But for a scheduler to be fair to all processors, it must also balance the load of jobs among them. For load balancing on SMP systems, the Linux scheduler invokes the rebalance_tick() function, which is called by scheduler_tick(). rebalance_tick() first updates the current CPU’s load variable and then goes up the CPU’s scheduler domain hierarchy, attempting to rebalance.
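
The idea behind rebalancing can be shown with a toy model. To be clear, this is not what rebalance_tick() does internally: it ignores scheduler domains completely, treats load as a plain count of runnable processes per CPU, and uses a made-up "move half of the difference" rule. The function names are hypothetical as well. It only illustrates a CPU pulling work from the busiest CPU.

```c
/* Return the index of the CPU with the highest load (lowest index on ties). */
static int toy_find_busiest(const int load[], int ncpus)
{
    int busiest = 0;

    for (int cpu = 1; cpu < ncpus; cpu++) {
        if (load[cpu] > load[busiest])
            busiest = cpu;
    }
    return busiest;
}

/*
 * Toy rebalance step run on behalf of this_cpu: pull half of the load
 * difference from the busiest CPU. Returns the number of processes moved.
 */
static int toy_rebalance(int load[], int ncpus, int this_cpu)
{
    int busiest = toy_find_busiest(load, ncpus);
    int imbalance;

    if (busiest == this_cpu)
        return 0;                   /* nothing to pull from ourselves */

    imbalance = (load[busiest] - load[this_cpu]) / 2;
    if (imbalance <= 0)
        return 0;                   /* already balanced well enough */

    load[busiest] -= imbalance;
    load[this_cpu] += imbalance;
    return imbalance;
}
```

Starting from loads {6, 0, 2, 2}, a rebalance step for CPU 1 moves 3 processes from CPU 0 and leaves {3, 3, 2, 2}.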

We can also select different scheduling policies for a process; the details can be found by running “man sched_setscheduler”.
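
As a small illustration of the policy interface, the sketch below uses sched_setscheduler() together with the standard policy constants SCHED_OTHER, SCHED_FIFO and SCHED_RR from <sched.h>. The helper names policy_name() and make_fifo() are our own. Switching to a real-time policy normally needs privileges, so without them the call is expected to fail with errno set to EPERM.

```c
#include <sched.h>

/* Human-readable name for the three classic policies. */
static const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER: return "SCHED_OTHER";   /* default time-sharing */
    case SCHED_FIFO:  return "SCHED_FIFO";    /* real-time, no time slicing */
    case SCHED_RR:    return "SCHED_RR";      /* real-time, round robin */
    default:          return "unknown";
    }
}

/*
 * Try to move the calling process to SCHED_FIFO with the given
 * real-time priority. Returns 0 on success, -1 on failure with errno
 * set (typically EPERM when run without privileges).
 */
static int make_fifo(int rt_priority)
{
    struct sched_param sp = { .sched_priority = rt_priority };

    return sched_setscheduler(0, SCHED_FIFO, &sp);
}
```

The current policy of a process can be read back with sched_getscheduler(0) and passed to policy_name() for printing.
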
In summary, the Linux CPU scheduler is a priority-based, preemptive, multiprocessor scheduler that works in constant time and tries to be fair to both interactive and CPU bound processes.

