
Exercise 8: Synchronization Mechanisms
Exercise Goal: You will add a new synchronization mechanism to the kernel.
This will require you to create a design for a new mechanism and then to add
its implementation to the kernel.
Introduction
Every modern kernel provides one or more synchronization primitives to enable
application programmers to coordinate the behavior of multithreaded software.
Classic UNIX systems did not provide mechanisms for fine-grained multithread
operation; the primary means for synchronizing processes was pipes (and files).
System V UNIX provided semaphores as part of the shared memory implementation;
these primitives are defined in POSIX and hence are supported by Linux.
Blocking a Task
As a task (a process executing in kernel space) runs, it might have to wait for
some condition to become true before it can continue. For example, it might
wait for an I/O operation to complete (an interrupt to occur). When a task
decides to block itself, it changes its own state by changing the
current->state field in its descriptor from TASK_RUNNING to either
TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE. Then it enqueues itself on a list
of blocked processes and invokes the scheduler to run a different task. A task
in the TASK_INTERRUPTIBLE state can be reactivated by a signal or an interrupt,
whereas a task in the TASK_UNINTERRUPTIBLE state can be reawakened only if an
interrupt occurs and changes something in the machine state so that the task
can run again. For example, when a task starts an I/O operation, the device
driver code places the task in the TASK_UNINTERRUPTIBLE state. The task can be
awakened only by the interrupt that indicates that the I/O operation has
completed.
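The wake-up rules just described can be summarized in a small user-space model. This is only a sketch: the names WAKE_SIGNAL, WAKE_INTERRUPT, and can_wake() are hypothetical, and the numeric state values are assumptions made for illustration rather than values taken from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* State values are assumed for illustration; the kernel defines its own. */
enum task_state {
    TASK_RUNNING         = 0,
    TASK_INTERRUPTIBLE   = 1,
    TASK_UNINTERRUPTIBLE = 2
};

/* Hypothetical names for the two wake-up sources discussed in the text. */
enum wake_source { WAKE_SIGNAL, WAKE_INTERRUPT };

/* Returns true if a blocked task in `state` may be reactivated by `src`. */
bool can_wake(enum task_state state, enum wake_source src)
{
    switch (state) {
    case TASK_INTERRUPTIBLE:
        return true;                    /* a signal or an interrupt will do */
    case TASK_UNINTERRUPTIBLE:
        return src == WAKE_INTERRUPT;   /* signals do not wake this task */
    default:
        assert(state == TASK_RUNNING);  /* a running task is not blocked */
        return false;
    }
}
```

The model ignores the extra requirement that the interrupt must actually change the condition on which the task is waiting.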
As mentioned, when a task blocks itself, the kernel code links the process into
a list of tasks waiting for the given condition to become true. Thus you can
think of each condition on which a task could block as defining a list of tasks.
Whenever the corresponding condition becomes true, one or more (depending on
the semantics of the condition) of the blocked tasks in the list become
unblocked. For example, the classic Dijkstra semaphore (see Chapter 8 in
Nutt [2000]) is a form of condition that can block a task. A binary semaphore is
an unsigned integer variable that has an initial value 1. For a semaphore, S, its
value can be changed only by one of two functions, P(S) or V(S). Think of a
semaphore as being like a C++ class or an abstract data type. That is, it has
private data to represent the unsigned integer value and two public functions on
its interface. The binary semaphore is the simplest form of semaphore; it can
take on only values of 0 or 1. By contrast, general semaphores can take on
arbitrary non-negative integer values. For a binary semaphore, when P(S) is
called and S has the value 1, then S is decremented (changed to 0) and the
calling process continues to execute. If S has the value 0, then the calling
process is blocked until S is incremented to 1. The V(S) operation increments
the semaphore.
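The value transitions of a binary semaphore can be sketched in user space as follows. This is a minimal model under stated assumptions, not kernel code: struct bsem, bsem_init(), bsem_try_p(), and bsem_v() are hypothetical names, and instead of blocking the caller the model simply reports that a real P(S) would have blocked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of a binary semaphore's value. */
struct bsem { unsigned int value; };   /* always 0 or 1 */

void bsem_init(struct bsem *s) { s->value = 1; }

/* Models P(S): returns true if the caller may continue, false if a
 * real implementation would block the caller until V(S) is called. */
bool bsem_try_p(struct bsem *s)
{
    assert(s->value <= 1);
    if (s->value == 1) {
        s->value = 0;
        return true;
    }
    return false;
}

/* Models V(S) when no task is waiting: the value returns to 1. */
void bsem_v(struct bsem *s)
{
    s->value = 1;
}
```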
The paradigm for using a semaphore is for a process to call P(S) before it
begins to execute a critical section, code that could cause a race condition of
the type described in Exercise 6. When the process completes executing the
critical section code, it calls V(S) to increment the semaphore. Note that if one
process is in a critical section (S will have been set to 0), then another
process that attempts to enter its critical section by calling P(S) will be
blocked until the first process executes V(S). The P() and V() functions are
themselves critical sections, meaning that when they are being executed, they
cannot be interrupted.
You can implement a semaphore by creating a kernel abstract data type that
has an unsigned integer variable for the value of the semaphore and the two
publicly accessible functions, P() and V(). You also can keep track of the tasks
that are blocked on the semaphore by putting each blocked task in a list. If the
value of the semaphore is 1, then the list is empty. If one or more tasks are
waiting for S, then each of these tasks is in the abstract data type's internal
task list. When V(S) is called, exactly one task is removed from the list,
thereby leaving the others still blocked. If no tasks are waiting when V(S) is
called, then the value of the semaphore is changed to 1 with no further action.
More details on how this works are provided after a discussion of kernel wait
queues.
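The bookkeeping described above (a value plus a list of blocked tasks) can be sketched in user space. Everything here is an assumption made for illustration: tasks are represented by small integers, the names struct ksem, ksem_p(), and ksem_v() are hypothetical, the list is a fixed-size array, and waiters are released in FIFO order, which the text does not require.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WAITERS 8                 /* arbitrary bound for the sketch */

/* Hypothetical model: tasks are identified by small integers. */
struct ksem {
    unsigned int value;               /* 0 or 1 */
    int waiters[MAX_WAITERS];         /* FIFO of blocked task ids */
    size_t nwaiters;
};

void ksem_init(struct ksem *s) { s->value = 1; s->nwaiters = 0; }

/* P(S): returns 1 if `task` enters, 0 if it was put on the wait list. */
int ksem_p(struct ksem *s, int task)
{
    if (s->value == 1) {
        s->value = 0;
        return 1;
    }
    assert(s->nwaiters < MAX_WAITERS);
    s->waiters[s->nwaiters++] = task;
    return 0;
}

/* V(S): wakes exactly one waiter and returns its id, or sets the value
 * back to 1 and returns -1 when nobody is waiting. */
int ksem_v(struct ksem *s)
{
    if (s->nwaiters == 0) {
        s->value = 1;
        return -1;
    }
    int woken = s->waiters[0];
    for (size_t i = 1; i < s->nwaiters; i++)
        s->waiters[i - 1] = s->waiters[i];
    s->nwaiters--;
    return woken;                     /* value stays 0: the woken task holds S */
}
```

Note that the value remains 0 when a waiter is released, because the released task immediately becomes the holder of the semaphore.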
Wait Queues
The situation whereby a task/process must wait for some condition to become
true occurs in many places in the kernel. This is another way of saying that the
kernel has many critical sections in it. Therefore Linux provides a single,
fundamental abstract data structure that can be used to maintain lists of tasks
waiting on a condition, called wait queues. A wait queue is a circular list of
elements in which each element contains a pointer to a struct task_struct
process descriptor. These list implementations are not private; the kernel is
implemented as a single monolithic block of code rather than as a collection of
modules. However, the intent is that tasks be inserted into a wait queue only by
using __add_wait_queue() and removed by using __remove_wait_queue().
Following are the relevant definitions for a wait queue, taken from
include/linux/wait.h for the data structure and macro and from
include/linux/sched.h for the functions.
#define WAIT_QUEUE_HEAD(x) ((struct wait_queue *)((x)-1))

struct wait_queue {
    struct task_struct * task;
    struct wait_queue * next;
};

extern inline void __add_wait_queue(
    struct wait_queue ** p, struct wait_queue * wait)
{
    wait->next = *p ? : WAIT_QUEUE_HEAD(p);
    *p = wait;
}

extern inline void __remove_wait_queue(
    struct wait_queue ** p, struct wait_queue * wait)
{
    struct wait_queue * next = wait->next;
    struct wait_queue * head = next;
    struct wait_queue * tmp;

    while ((tmp = head->next) != wait) {
        head = tmp;
    }
    head->next = next;
}
This code is very compact. It does no error checking and uses a minimal number
of instructions to maintain the list. The __add_wait_queue() code simply places
an element of type struct wait_queue * at the head of the wait queue (the
argument of type struct wait_queue **). To enqueue a task, T, you create a wait
queue instance, newInstance, set the task field, newInstance->task, to reference
the task, and then call __add_wait_queue(targetWaitQueue, newInstance). The
__remove_wait_queue() function works similarly to remove an element from the
wait queue. Comments in the source code warn kernel programmers to be very
careful about manipulating the lists without using these add and remove
functions.
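To experiment with the circular-list logic outside the kernel, the following user-space sketch may help. It is not the kernel's implementation: it swaps the WAIT_QUEUE_HEAD() pointer arithmetic for an explicit sentinel node, and struct task, struct wq_node, wq_init(), wq_add(), wq_remove(), and wq_length() are all hypothetical names. The removal loop follows the same idea as __remove_wait_queue(): walk around the circle until the predecessor of the element is found.

```c
#include <assert.h>
#include <stddef.h>

struct task { int pid; };            /* hypothetical stand-in for struct task_struct */

struct wq_node {
    struct task    *task;
    struct wq_node *next;
};

/* An empty queue is a sentinel node that points to itself. */
void wq_init(struct wq_node *head)
{
    head->task = NULL;
    head->next = head;
}

/* Insert `wait` at the front of the circular list. */
void wq_add(struct wq_node *head, struct wq_node *wait)
{
    wait->next = head->next;
    head->next = wait;
}

/* Walk the circle from `wait` until its predecessor is found, then unlink it. */
void wq_remove(struct wq_node *wait)
{
    struct wq_node *prev = wait->next;

    assert(prev != NULL);            /* `wait` must currently be queued */
    while (prev->next != wait)
        prev = prev->next;
    prev->next = wait->next;
    wait->next = NULL;
}

/* Count queued entries, excluding the sentinel. */
size_t wq_length(const struct wq_node *head)
{
    size_t n = 0;

    for (const struct wq_node *p = head->next; p != head; p = p->next)
        n++;
    return n;
}
```

As in the kernel code, removal costs time proportional to the queue length because the list is singly linked.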
Using Wait Queues
The code in kernel/sched.c provides another set of functions that use wait
queues for common forms of task management. The functions sleep_on()
and interruptible_sleep_on() place the calling task on the specified wait
queue. The first function sets current->state to TASK_UNINTERRUPTIBLE,
and the second sets it to TASK_INTERRUPTIBLE.
#define SLEEP_ON_VAR \
    unsigned long flags; \
    struct wait_queue wait;

#define SLEEP_ON_HEAD \
    wait.task = current; \
    write_lock_irqsave(&waitqueue_lock,flags); \
    __add_wait_queue(p, &wait); \
    write_unlock(&waitqueue_lock);

#define SLEEP_ON_TAIL \
    write_lock_irq(&waitqueue_lock); \
    __remove_wait_queue(p, &wait); \
    write_unlock_irqrestore(&waitqueue_lock,flags);

void interruptible_sleep_on(struct wait_queue **p)
{
    SLEEP_ON_VAR
    current->state = TASK_INTERRUPTIBLE;
    SLEEP_ON_HEAD
    schedule();
    SLEEP_ON_TAIL
}

void sleep_on(struct wait_queue **p)
{
    SLEEP_ON_VAR
    current->state = TASK_UNINTERRUPTIBLE;
    SLEEP_ON_HEAD
    schedule();
    SLEEP_ON_TAIL
}
These functions are identical, except for the value that they assign to
current->state. Notice that they also do not contain many statements (your
solution to this exercise will not contain many lines of code either). The
SLEEP_ON_HEAD macro adds the task to the wait queue. The functions then call
the scheduler, with the calling task being blocked (not eligible to be
scheduled). When some other task awakens the sleeping task with wake_up() or
wake_up_interruptible() (discussed next), the newly awakened task removes
itself from the wait queue by using the SLEEP_ON_TAIL macro. The context in
which the sleep_on() function may be called could allow interruption. Therefore
the macros use a lock for the critical sections that manipulate a wait queue.
Here are the corresponding functions to wake up the task, both based on
__wake_up().
#define wake_up(x) __wake_up((x),TASK_UNINTERRUPTIBLE | \
                             TASK_INTERRUPTIBLE)
#define wake_up_interruptible(x) __wake_up((x),TASK_INTERRUPTIBLE)

void __wake_up(struct wait_queue **q, unsigned int mode)
{
    struct task_struct *p;
    struct wait_queue *head, *next;

    if (!q)
        goto out;
    /*
     * this is safe to be done before the check because it
     * means no deference, just pointer operations.
     */
    head = WAIT_QUEUE_HEAD(q);

    read_lock(&waitqueue_lock);
    next = *q;
    if (!next)
        goto out_unlock;

    while (next != head) {
        p = next->task;
        next = next->next;
        if (p->state & mode) {
            /*
             * We can drop the read-lock early if this
             * is the only/last process.
             */
            if (next == head) {
                read_unlock(&waitqueue_lock);
                wake_up_process(p);
                goto out;
            }
            wake_up_process(p);
        }
    }
out_unlock:
    read_unlock(&waitqueue_lock);
out:
    return;
}
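The role of the mode argument can be seen in isolation with a user-space sketch. The bit values below are assumptions chosen so that the two blocked states occupy distinct bits, and struct sim_task and sim_wake_up() are hypothetical names. The sketch models only the `p->state & mode` filter, not the locking or the scheduler.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bit values for illustration; the kernel headers define the real ones. */
#define TASK_RUNNING          0u
#define TASK_INTERRUPTIBLE    1u
#define TASK_UNINTERRUPTIBLE  2u

struct sim_task { unsigned int state; };   /* hypothetical stand-in */

/* Mark every task whose state matches `mode` as runnable and return
 * how many were woken, mirroring the `p->state & mode` test. */
size_t sim_wake_up(struct sim_task *tasks[], size_t n, unsigned int mode)
{
    size_t woken = 0;

    assert(tasks != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (tasks[i]->state & mode) {
            tasks[i]->state = TASK_RUNNING;
            woken++;
        }
    }
    return woken;
}
```

Passing only TASK_INTERRUPTIBLE corresponds to wake_up_interruptible(), which leaves uninterruptible sleepers untouched; passing both bits corresponds to wake_up().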
In Exercise 6, you studied the System V shared memory implementation. In
your solution to the exercise problem, you had to change all array references of
the form shm_segs[] to a list reference. In the function findkey() in
ipc/shm.c, you undoubtedly observed the following code segment.

for (id = 0; id <= max_shmid; id++) {
    while ((shp = shm_segs[id]) == IPC_NOID)
        sleep_on (&shm_lock);
    if (shp == IPC_UNUSED)
        continue;
    if (key == shp->u.shm_perm.key)
        return id;
}

And in newseg() in ipc/shm.c, you saw the following complementary code
segment.

found:
    shp = (struct shmid_kernel *) kmalloc (...);
    if (!shp) {
        shm_segs[id] = (struct shmid_kernel *) IPC_UNUSED;
        wake_up (&shm_lock);
        return -ENOMEM;
    }

Now you can understand what is going on in this code. A race condition can
occur when processes attempt to find a new key and is detectable by the
IPC_NOID value in the shm_segs[id] variable. If the code detects the race,
then the loser of the race puts itself to sleep on the shm_lock wait queue, to
be awakened in the newseg() function by the wake_up(&shm_lock) function call.
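The three slot states involved in this race can be modeled in user space. This is a sketch under assumptions: struct seg, the marker objects standing in for IPC_UNUSED and IPC_NOID, and findkey_model() are hypothetical, and where the kernel would call sleep_on() the model returns a distinct code instead of blocking.

```c
#include <assert.h>
#include <stddef.h>

struct seg { int key; };                      /* hypothetical descriptor */

/* Hypothetical sentinel objects standing in for IPC_UNUSED and IPC_NOID. */
struct seg unused_marker, noid_marker;
#define SLOT_UNUSED (&unused_marker)
#define SLOT_NOID   (&noid_marker)

#define KEY_NOT_FOUND (-1)
#define KEY_MUST_WAIT (-2)   /* the kernel would sleep_on() here instead */

/* Scan the table for `key`, reporting when a slot is mid-allocation. */
int findkey_model(struct seg *slots[], int n, int key)
{
    assert(n >= 0);
    for (int id = 0; id < n; id++) {
        struct seg *shp = slots[id];

        if (shp == SLOT_NOID)
            return KEY_MUST_WAIT;    /* another process is allocating this slot */
        if (shp == SLOT_UNUSED)
            continue;
        if (shp->key == key)
            return id;
    }
    return KEY_NOT_FOUND;
}
```

Once the allocating process either installs a descriptor or resets the slot to the unused marker, a repeated scan no longer reports that it must wait.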
Problem Statement
Part A
Design and implement a new kernel synchronization primitive that will allow
multiple processes to block on an event until some other process signals the
event. When a process signals the event, all processes that are blocked on the
event are unblocked. If no processes are blocked on an event when it is
signaled, then the signal has no effect. Implement the following new system
calls.
int  evntopen(int);           // Creates an event, returning ID
                              // arg=0 => new event; otherwise, uses
                              // an existing event; error accordingly
int  evntclose(int);          // Destroy an event
int  evntwait(int eventNum);  // Blocks thread until the event is
                              // signaled
void evntsig(int eventNum);   // Unblocks all waiting threads
                              // If no processes are blocked,
                              // ignored
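The required semantics differ from a semaphore in one important way: a signal is a broadcast that is not remembered. The following user-space model illustrates that difference only. It is a sketch with hypothetical names (struct event_model, model_wait(), model_sig()) and it counts blocked tasks rather than actually blocking them.

```c
#include <assert.h>

/* Hypothetical model: an event only remembers how many tasks are blocked. */
struct event_model { int blocked; };

/* A task calling evntwait() always blocks; earlier signals are not remembered. */
void model_wait(struct event_model *e)
{
    assert(e->blocked >= 0);
    e->blocked++;
}

/* evntsig() releases every blocked task and returns how many there were. */
int model_sig(struct event_model *e)
{
    int released = e->blocked;

    e->blocked = 0;
    return released;
}
```

In this model a signal sent while nobody waits returns 0 and leaves no trace, so a task that waits afterward still blocks until the next signal.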
Part B
Write an application program to test your new kernel functions. Your test
program should show that the kernel functions work for the usual cases, for
example, with one process waiting, and also for boundary conditions such as
- no tasks waiting when evntsig() is called,
- multiple tasks waiting when evntsig() is called,
- multiple events open at one time, and
- processes waiting when evntclose() is called.
Attacking the Problem
You are to design a completely new facility to add to the kernel. When you add
the facility, you will probably need to create a new implementation file (and
change Makefile so that it compiles your new file). You also likely will need to
change a few parts of the existing kernel, such as the initialization code. To do
this, you should write a new evnt_init() function that can be called when the
system is booted:
int evnt_init(int); // Initializes events (not a sys call)
You also need to change the system initialization code to call this new
evnt_init() function. (The initialization routine has an underscore between
evnt and init because it is named by using the internal kernel function naming
convention rather than the system call naming convention.) You can use this
function to set up any internal structures that you require in your solution.
Begin by thinking carefully about the data structures that you will need to solve
the problem. Your system will need to support several different events, so you
need a set of descriptor data structures, one for each event. If you solved
Exercise 6, then recall that each shared memory segment had its own descriptor;
you will need the analog of a shared memory descriptor for an event. You
can decide if you want a list or an array of such descriptors. Then your
evnt_init() function can initialize that data structure.
If you use an array, you nevertheless should dynamically allocate the space for
each entry (again, see the shared memory segment descriptors). This will
require that you use kmalloc() and kfree(), two functions that allocate and
free kernel memory dynamically.
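One possible shape for such a table is sketched below in user space, with calloc() and free() standing in for kmalloc() and kfree(). All names (MAX_EVENTS, struct event_desc, event_table, table_init(), table_open(), table_close()) are hypothetical, and the descriptor holds only a placeholder field where a real solution would keep a wait queue. Because an argument of 0 to evntopen() means "create a new event", the sketch assumes that valid IDs start at 1.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_EVENTS 16                 /* arbitrary limit for the sketch */

struct event_desc {                   /* hypothetical descriptor layout */
    int waiters;                      /* placeholder for a wait queue */
};

/* Slot i holds the descriptor for event ID i + 1, or NULL if unused.
 * ID 0 is reserved because evntopen(0) means "create a new event". */
struct event_desc *event_table[MAX_EVENTS];

/* Plays the role of evnt_init(): mark every slot as unused. */
void table_init(void)
{
    for (int i = 0; i < MAX_EVENTS; i++)
        event_table[i] = NULL;
}

/* Returns a new ID when id == 0, the same ID if it names an open
 * event, or -1 on error. */
int table_open(int id)
{
    if (id != 0)
        return (id >= 1 && id <= MAX_EVENTS && event_table[id - 1]) ? id : -1;

    for (int i = 0; i < MAX_EVENTS; i++) {
        if (event_table[i] == NULL) {
            event_table[i] = calloc(1, sizeof *event_table[i]);
            return event_table[i] ? i + 1 : -1;
        }
    }
    return -1;                        /* table full */
}

/* Frees the descriptor; returns 0 on success, -1 for a bad ID. */
int table_close(int id)
{
    if (id < 1 || id > MAX_EVENTS || event_table[id - 1] == NULL)
        return -1;
    free(event_table[id - 1]);
    event_table[id - 1] = NULL;
    return 0;
}
```

The sketch does not decide what should happen to tasks still waiting when an event is closed; that boundary case is left to your design.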
You might want to work at the level of wait queues and the low-level
manipulation routines. Your evntwait() code then might have a skeleton such as
the following.
int evntwait(int eventNum)
{
    ...
    current->state = TASK_INTERRUPTIBLE;
    add_wait_queue(...);
    schedule();
    ...
}
See the discussion in this exercise's introduction.
Similarly, your evntsig() function might have a skeleton such as that shown for
the wake_up() implementation.
However, if you decide to use the interruptible_sleep_on()/sleep_on() and
wake_up()/wake_up_interruptible() internal kernel functions, then your
evntwait() function might have a skeleton such as this.
int evntwait(int eventNum)
{
    ...
    interruptible_sleep_on(...);
    ...
}
Similarly, your evntsig() function might have a skeleton such as this.
int evntsig(int eventNum)
{
    ...
    wake_up_interruptible(...);
    ...
}