ck kernel wiki

The -ck patchset supports all the scheduling policies that mainline supports, and adds two extra scheduling policies of its own. This document describes each policy and its use.

The default scheduling policies in mainline and -ck, and their definitions, are as follows:

REALTIME POLICIES

These policies require either superuser privilege (i.e. running as root) or realtime capabilities granted to unprivileged users, for example through a PAM module. They include SCHED_FIFO and SCHED_RR.

SCHED_FIFO

These processes are scheduled according to their realtime priority, which is unrelated to the nice value. The highest priority process runs indefinitely and gives up the cpu only voluntarily or to an even higher priority realtime task. Only proper realtime code should ever use this policy, as a runaway process is very likely to hard lock the machine. Professional audio applications such as jack use this policy.

SCHED_RR

These run similarly to SCHED_FIFO, except that when more than one process has the same realtime priority, each runs for a short period in turn and they share the cpu.

NON REALTIME POLICIES

These policies do not require special privileges to use and include SCHED_NORMAL and SCHED_BATCH in mainline and -ck. -ck also includes two extra unprivileged policies, SCHED_ISO and SCHED_IDLEPRIO.

SCHED_NORMAL

This is how most normal applications are run. The amount of cpu each process consumes and the latency it gets are mostly determined by the 'nice' value. These processes run for short periods and share the cpu with all other processes running under the same policy, across all nice values. This policy is known as SCHED_OTHER in most of the rest of the world, including the glibc headers, as per POSIX.1.

SCHED_BATCH

Similar to SCHED_NORMAL in every way, except that marking a task as batch tells the kernel that it should never be considered an interactive task. Use it when you want a task to get the same share of cpu time as a SCHED_NORMAL task at the same nice level, but do not care what latency it gets.

SCHED_ISO

Unique to -ck, this is a scheduling policy designed for pseudo-realtime scheduling without requiring superuser privileges (unlike SCHED_RR and SCHED_FIFO). A task scheduled as SCHED_ISO can receive very low latency scheduling and can take the full cpu like SCHED_RR, but unlike realtime tasks it cannot starve the machine, because an upper limit to its cpu usage is specified in a tunable (see below). It is designed to give realtime-like behaviour, without the risk of hanging, to programs not really coded safely enough to be run realtime, such as ordinary audio and video playback software. SCHED_ISO does not take a realtime priority but a nice level, like other normal tasks, although the nice value is largely ignored except when the task uses more than its cpu limit.

SCHED_IDLEPRIO

Also unique to -ck, this is a scheduling policy designed for tasks that should use purely idle time. If anything else is running, these tasks will not progress at all. This policy is useful for prolonged computing tasks such as distributed computing clients (setiathome, foldingathome, etc) and prevents them from stealing any cpu time from other applications. It can also be useful for running large software compilations without affecting other tasks. These tasks are also flagged in -ck for less aggressive memory usage and disk read bandwidth, but these effects are not potent, and if the task uses a lot of memory and disk it will be noticeable.

SCHED_IDLEPRIO takes a nice value. However, this nice value only determines the cpu distribution between idleprio tasks, allowing many idleprio tasks to run with different nice values. This might be desirable, for example, when running a distributed computing client at nice 19 and compiling software at nice 0 while both are SCHED_IDLEPRIO.

Specifying policies

Usually, the scheduling policy is chosen by the application itself, which sets it from within its code using the sched_setscheduler function. However, it may be desirable to specify the policy manually when starting an application, especially since support for the -ck specific policies within applications is unusual.

SCHED_ISO has a unique starting mechanism in -ck. If you start an application that requests a realtime policy (like jackd) but you do not have realtime privileges, the -ck scheduler will detect the request and automatically give the task the SCHED_ISO policy instead. Applications like cdrecord and jackd would commonly benefit from this.

Manually specifying the policy when starting an application requires either a command line tool that spawns the application or a wrapper that ensures it starts with the desired policy. Once a task has been started under a different scheduling policy, -ck disallows changing the policy back without superuser privileges, to prevent the application from unwittingly dropping its processes back to SCHED_NORMAL. The only change allowed without superuser privileges is a decrease to a lower policy, where the hierarchy is as follows:

SCHED_ISO > SCHED_BATCH > SCHED_IDLEPRIO
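
The rule above can be sketched as a small shell check. This only illustrates the hierarchy as written here and is not the kernel's own logic. policy_rank and may_lower_policy are hypothetical helper names, and any policy outside the three listed is rejected as unknown.

```shell
# policy_rank NAME: print a rank for the three policies in the hierarchy
# above (higher number = higher policy). Hypothetical helper.
policy_rank() {
    case "$1" in
        SCHED_ISO)      echo 3 ;;
        SCHED_BATCH)    echo 2 ;;
        SCHED_IDLEPRIO) echo 1 ;;
        *)              return 1 ;;
    esac
}

# may_lower_policy FROM TO: succeed only if TO is strictly lower than FROM,
# i.e. the change would be allowed without superuser privileges.
# Returns 2 if either name is not one of the three policies listed.
may_lower_policy() {
    from=$(policy_rank "$1") || return 2
    to=$(policy_rank "$2") || return 2
    [ "$to" -lt "$from" ]
}
```

For example, may_lower_policy SCHED_ISO SCHED_IDLEPRIO succeeds, while may_lower_policy SCHED_IDLEPRIO SCHED_BATCH fails.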

Schedtool

schedtool provides support for the extra -ck policies and makes starting a task with any policy relatively straightforward.

schedtool -I -e amarok

Will start the application amarok as SCHED_ISO.

schedtool -D -e setiathome

Will start setiathome as SCHED_IDLEPRIO.

Any threads or processes spawned by these tasks will also inherit the same policy.
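
If you would rather not remember the flag letters, the invocations above can be wrapped in a small shell function. This is a sketch only. policy_flag and run_with_policy are hypothetical helper names. The -I and -D flags come from the examples above. The -N and -B flags for SCHED_NORMAL and SCHED_BATCH are assumed from schedtool's usage text and should be checked against your installed version.

```shell
# policy_flag NAME: print the schedtool flag for a policy name.
# -I and -D are the flags used in the examples above; -N and -B are
# assumptions to verify against "schedtool -h" on your system.
policy_flag() {
    case "$1" in
        SCHED_NORMAL)   printf '%s\n' -N ;;
        SCHED_BATCH)    printf '%s\n' -B ;;
        SCHED_ISO)      printf '%s\n' -I ;;
        SCHED_IDLEPRIO) printf '%s\n' -D ;;
        *)              return 1 ;;
    esac
}

# run_with_policy NAME COMMAND [ARGS...]: start COMMAND under policy NAME.
run_with_policy() {
    flag=$(policy_flag "$1") || { echo "unknown policy: $1" >&2; return 1; }
    shift
    schedtool "$flag" -e "$@"
}
```

With this in place, run_with_policy SCHED_ISO amarok is equivalent to the first example above.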


If the task is already running, you can find its pid using 'top' or 'ps' and then change the policy on the fly using that pid as follows:

schedtool -D $pid

Or:

schedtool -D `pidof cpu_hog`

Note that this only changes one process; if there are multiple processes or threads, each needs to be changed separately. When a task is threaded, make sure you check the virtual pid number of each thread, as threads do NOT normally show up in top or ps. This is why changing the policy after the task is already running is unlikely to achieve the desired result. You can enable thread mode in top, or alternatively you may find this ps command helpful, as it gives a global summary of everything:

ps -wweALo pid,spid,user,priority,ni,pcpu,vsize,time,args

The second column will show the thread's virtual pid which is what needs to be modified with schedtool.
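
Changing every thread by hand is tedious, so the step above can be sketched as a loop. This assumes a Linux /proc filesystem, where each thread of a process appears as a directory under /proc/PID/task. list_tids and set_idleprio_all are hypothetical helper names. Threads created after the loop has run only pick up the policy if they are spawned by a thread that already has it.

```shell
# list_tids PID: print one thread id per line for the given process,
# read from /proc/PID/task (Linux-specific).
list_tids() {
    for dir in /proc/"$1"/task/*; do
        if [ -d "$dir" ]; then
            basename "$dir"
        fi
    done
}

# set_idleprio_all PID: apply SCHED_IDLEPRIO to every current thread of PID,
# using the same "schedtool -D" invocation shown above.
set_idleprio_all() {
    for tid in $(list_tids "$1"); do
        schedtool -D "$tid"
    done
}
```

For example, set_idleprio_all `pidof cpu_hog` would cover all threads of a single cpu_hog process.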

Toolsched

toolsched is a wrapper for the extra -ck policies that is designed to ensure that certain tasks are always started under the policy of your choice. It requires that schedtool is also installed to work.

To use toolsched, you need to install it and create one symbolic link for every application that you want wrapped. The links must live in a directory that comes first in your PATH, ideally one that doesn't require special privileges to write to. Generally ~/bin is a good place for this, but having a separate toolsched directory in your PATH may make it easier to keep track of special wrappers. To set this up, first install the toolsched scripts (usually in /usr/bin) and alter your shell's resource file (usually .bashrc) by adding the following:

PATH=$HOME/.toolsched:$PATH
export PATH

Then create the toolsched directory

mkdir ~/.toolsched

Then add a symbolic link for each command, pointing at the script for the policy of your choice: toolsched.d for idleprio and toolsched.i for iso.

ln -s /usr/bin/toolsched.d ~/.toolsched/setiathome
ln -s /usr/bin/toolsched.i ~/.toolsched/growisofs

The next time you start your shell, these symlinks should automatically set the policy each time you start these commands. Note that the wrappers are transparent to other applications and will still work even if you are not running a -ck kernel.
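
If you wrap many commands, the mkdir and ln steps above can be combined into one shell function. This is a sketch under the same assumptions as the text: the scripts are named toolsched.d and toolsched.i and are installed in /usr/bin. wrap_command and the TOOLSCHED_BIN override are hypothetical names introduced for illustration.

```shell
# wrap_command LETTER COMMAND [DIR]
# LETTER is "d" (idleprio) or "i" (iso), matching toolsched.d and toolsched.i.
# DIR defaults to ~/.toolsched. TOOLSCHED_BIN (hypothetical) overrides the
# directory where the toolsched scripts are installed.
wrap_command() {
    dir=${3:-$HOME/.toolsched}
    case "$1" in
        d|i) ;;
        *) echo "policy letter must be d or i" >&2; return 1 ;;
    esac
    mkdir -p "$dir" || return 1
    # -f replaces an existing link so the policy for a command can be changed
    ln -sf "${TOOLSCHED_BIN:-/usr/bin}/toolsched.$1" "$dir/$2"
}
```

For example, wrap_command d setiathome and wrap_command i growisofs recreate the two links shown above.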

SCHED_ISO Tunables

The stable release of -ck currently includes only one tunable for SCHED_ISO. That value is kept in

/proc/sys/kernel/iso_cpu

and is the maximum percentage of cpu, averaged over a rolling 3 second window, that SCHED_ISO tasks may use while being scheduled as pseudo realtime. Values from 0 to 100 are valid. At 100 the soft realtime nature is abolished and SCHED_ISO tasks behave as SCHED_RR tasks. This tunable is set to 80 by default on -ck and to 0 on -cks. It is enforced on a per-cpu basis. The value can be changed by echoing a new value to the file:

echo 100 > /proc/sys/kernel/iso_cpu

or by adding an entry into /etc/sysctl.conf

kernel.iso_cpu=30
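
Since only values from 0 to 100 are valid, a small guard before writing the tunable can catch mistakes. The following is a minimal sketch. valid_iso_cpu and set_iso_cpu are hypothetical helper names, and the optional second argument exists only so the function can be tried against a scratch file. Writing to the real file under /proc is expected to require root.

```shell
# valid_iso_cpu VALUE: succeed only for a whole number from 0 to 100.
valid_iso_cpu() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, negative or non-numeric input
    esac
    [ "$1" -le 100 ]
}

# set_iso_cpu VALUE [FILE]: validate VALUE, then write it to the tunable.
# FILE defaults to the path given above.
set_iso_cpu() {
    file=${2:-/proc/sys/kernel/iso_cpu}
    valid_iso_cpu "$1" || { echo "iso_cpu must be 0 to 100" >&2; return 1; }
    echo "$1" > "$file"
}
```

For example, set_iso_cpu 30 has the same effect as the echo command above with a value of 30, and set_iso_cpu 150 is refused.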