Linux process priority and relation to top's PRI, NI

On 2017/07/01 at 13:36

A process's priority in Linux is a value in the range 0~139 (a lower number means a higher priority). Priorities 0~99 are for real-time processes and 100~139 are for normal processes. The default priority is defined in the kernel as follows:

From sched/prio.h (include/linux/sched/prio.h in the kernel source):

#define MAX_NICE    19
#define MIN_NICE    -20
#define NICE_WIDTH  (MAX_NICE - MIN_NICE + 1)

#define MAX_USER_RT_PRIO    100
#define MAX_RT_PRIO         MAX_USER_RT_PRIO

#define MAX_PRIO            (MAX_RT_PRIO + NICE_WIDTH)
#define DEFAULT_PRIO        (MAX_RT_PRIO + NICE_WIDTH / 2)

This means a normal process's default priority is 100 + 40 / 2 = 120.
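To double-check that arithmetic, here is a minimal sketch that mirrors the macros as C++ constants and lets the compiler verify the numbers. The k-prefixed names and the nice_to_prio helper are my own illustration, not kernel identifiers; the helper simply adds the nice value to the default priority, which is how I understand the kernel maps nice values onto the 100~139 range.

```cpp
// Constants mirrored from the macros quoted above (illustration only,
// the k-prefixed names are not kernel identifiers).
constexpr int kMaxNice = 19;
constexpr int kMinNice = -20;
constexpr int kNiceWidth = kMaxNice - kMinNice + 1;        // 40 nice levels

constexpr int kMaxRtPrio = 100;                            // 0..99 are real-time
constexpr int kMaxPrio = kMaxRtPrio + kNiceWidth;          // 140, one past the last valid priority
constexpr int kDefaultPrio = kMaxRtPrio + kNiceWidth / 2;  // 120

// Hypothetical helper: priority of a normal process with the given nice value.
constexpr int nice_to_prio(int nice) { return kDefaultPrio + nice; }

static_assert(kNiceWidth == 40, "there are 40 nice levels");
static_assert(kMaxPrio == 140, "valid priorities are 0..139");
static_assert(kDefaultPrio == 120, "default priority of a normal process");
static_assert(nice_to_prio(kMinNice) == 100, "nice -20 is the highest normal priority");
static_assert(nice_to_prio(kMaxNice) == 139, "nice 19 is the lowest normal priority");
```

If any of the numbers were wrong, the file would simply fail to compile.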

However, when reading the output of htop/top, I found that the PRI column shows PRI = 20 + NI for a normal process and PRI = -1 - prio for a real-time process, where prio is the real-time scheduling priority (sched_priority, 1~99). For example, the following program sets one thread as a real-time thread with real-time priority 56, one thread as a normal thread with priority = 120 + (-20) = 100, and one thread as a normal thread with priority = 120 + 19 = 139:

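#include <pthread.h>
#include <sched.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <unistd.h>

#include <cstdio>
#include <cstring>
#include <thread>

/* Typically needs root (or CAP_SYS_NICE): SCHED_FIFO and negative nice
   values are otherwise rejected. Build with: g++ -std=c++11 -pthread */
int main()
{
    std::thread t1([]() {
        /* Make this thread real-time (SCHED_FIFO) with priority 56 */
        struct sched_param param;
        param.sched_priority = 56;

        /* pthread_setschedparam returns an error number, it does not set errno */
        int err = pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
        if (err != 0)
            std::fprintf(stderr, "pthread_setschedparam: %s\n", std::strerror(err));

        while (true) { sleep(2); }
    });
    std::thread t2([]() {
        /* Set this thread's nice value to -20 (on Linux, passing a thread ID
           to setpriority affects only that thread) */
        if (setpriority(PRIO_PROCESS, syscall(SYS_gettid), -20) != 0)
            std::perror("setpriority(-20)");

        while (true) { sleep(2); }
    });
    std::thread t3([]() {
        /* Set this thread's nice value to 19 */
        if (setpriority(PRIO_PROCESS, syscall(SYS_gettid), 19) != 0)
            std::perror("setpriority(19)");

        while (true) { sleep(2); }
    });

    /* The threads never exit; stop the program with Ctrl-C */
    t1.join();
    t2.join();
    t3.join();
}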

The output of htop is as follows:

[Screenshot: process_priority_htop, the htop output for the program above]

After some googling, I found that this behavior is documented in the proc(5) man page:

(18) priority %ld (Explanation for Linux 2.6) For processes running a real-time scheduling policy (policy below; see sched_setscheduler(2)), this is the negated scheduling priority, minus one; that is, a number in the range -2 to -100, corresponding to real-time priorities 1 to 99. For processes running under a non-real-time scheduling policy, this is the raw nice value (setpriority(2)) as represented in the kernel. The kernel stores nice values as numbers in the range 0 (high) to 39 (low), corresponding to the user-visible nice range of -20 to 19.
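The two rules in that man page text can be written down as a pair of tiny functions. This is just a sketch of the mapping as I read it; the function names are made up for illustration and are not part of any kernel or procps API.

```cpp
// Value of the "priority" field in /proc/[pid]/stat for a real-time thread:
// the negated scheduling priority, minus one (1..99 maps to -2..-100).
// Function names are hypothetical, chosen for this example.
constexpr int stat_priority_for_rt(int sched_priority) { return -1 - sched_priority; }

// Value of the same field for a normal thread: the nice value shifted so that
// the user-visible range -20..19 becomes 0 (high) .. 39 (low).
constexpr int stat_priority_for_nice(int nice) { return 20 + nice; }

// The three threads of the test program above.
static_assert(stat_priority_for_rt(56) == -57, "SCHED_FIFO thread, priority 56");
static_assert(stat_priority_for_nice(-20) == 0, "thread with nice -20");
static_assert(stat_priority_for_nice(19) == 39, "thread with nice 19");
```

So for the test program, the PRI column should read -57, 0 and 39 for the three threads, with NI showing 0, -20 and 19.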

top and htop just read this value from /proc/[pid]/stat, so everything is explainable.
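To see the raw numbers without top or htop, the stat line can be parsed by hand. Below is a minimal sketch, not how top or htop actually implement it: StatPriority and parse_stat_priority are names I made up, and the parser only assumes the field order given in proc(5) (priority is field 18, nice is field 19).

```cpp
#include <sstream>
#include <string>

// Result of parsing one stat line (hypothetical type for this example).
struct StatPriority {
    long priority;  // field 18 of /proc/[pid]/stat
    long nice;      // field 19 of /proc/[pid]/stat
    bool ok;        // false if the line could not be parsed
};

// Extract the priority and nice fields from one line of /proc/[pid]/stat.
StatPriority parse_stat_priority(const std::string& line) {
    StatPriority result{0, 0, false};

    // Field 2 (comm) is wrapped in parentheses and may itself contain
    // spaces, so start tokenizing after the last ')'.
    const auto close = line.rfind(')');
    if (close == std::string::npos) return result;

    std::istringstream rest(line.substr(close + 1));
    std::string skipped;
    // Skip fields 3..17 (15 fields) that precede the priority field.
    for (int i = 0; i < 15; ++i) {
        if (!(rest >> skipped)) return result;
    }
    if (rest >> result.priority >> result.nice) result.ok = true;
    return result;
}
```

To use it, read one line with std::getline from a std::ifstream opened on /proc/self/stat and pass it to parse_stat_priority. For an individual thread like the ones in the test program, the per-thread file /proc/[pid]/task/[tid]/stat has the same layout.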
