
Chapter 1-3 :

Q. What is Kernel in Linux ?

Ans : The Kernel is the core program that runs programs and manages hardware devices, such as disks
and printers. It executes the commands provided by the shell environment and thus provides an
interface between the shell and the hardware.

Q2. Define the features of Linux ?

Ans : 1. Multi-tasking :

Linux supports true preemptive multi-tasking. All processes run entirely independently of each
other. No process needs to be concerned with making processor time available to other processes.
Multi-user access :

A multi-user system is a computer that is able to concurrently and independently execute several
applications belonging to two or more users.

Multi-processing :

Linux also runs on multi-processor architectures. This means that the O. S. can distribute several
applications across several processors.

Architecture independence (Portability) :

Linux runs on several hardware platforms, from the Amiga to the PC to DEC Alpha workstations.
Such hardware independence is achieved by no other serious O. S.

Demand load executables :

Only those parts of a program actually required for execution are loaded into memory. When a new
process is created using fork(), memory is not requested immediately, but instead the memory for
the parent process is used jointly by both processes.

Paging :

Linux provides paging, a very important memory management concept. Despite the best efforts to use
physical memory efficiently, it can happen that the available memory is fully taken up; in that case,
pages of memory which are not currently needed can be written out to swap space on disk.

Dynamic cache for hard disk :

Linux dynamically adjusts the size of cache memory in use to suit the current memory usage
situation.

Shared Libraries :

Libraries are collections of routines needed by a program for processing data. There are a number
of standard libraries used by more than one process at the same time.

Memory protected mode :

Linux uses the processor’s memory protection mechanisms to prevent a process from accessing
memory allocated to the system kernel or to other processes.

Support for national keyboards and fonts :

Under Linux, a wide range of national keyboards and character sets can be used : for example,
the Latin1 set defined by the International Organization for Standardization (ISO) which also
includes European special characters.

Different file systems :


Linux supports a variety of file systems. The most commonly used file system at present is the
Second Extended (Ext2) File system. This supports filenames of up to 255 characters and has a
number of features making it more secure than conventional Unix file systems.

Q. Define the file structure of Linux ?

Ans: The file structure of any O. S. describes the arrangement of files and folders. Linux organizes files
into a hierarchically connected set of directories. Each directory may contain either files or other
directories. Because of its similarity to a tree, such a structure is often referred to as a tree
structure; it is also called a parent-child structure.

The Linux file structure branches into several directories beginning with a root directory, /. Within
the root directory several system directories contain files and programs that are features of the
Linux system. These system directories are as follows :-
/ root : Begins the file system structure, called the root
/fs : The virtual file system interface is in the fs directory. The
implementations of the various file systems supported by LINUX are
held in the respective subdirectories.
/home : Contains users’ home directories
/bin : Holds all the standard commands and utility programs
/usr : Holds those files and commands used by the system; this directory
breaks down into several subdirectories
/usr/bin : Holds user-oriented commands and utility programs
/usr/sbin : Holds system administration commands
/usr/lib : Holds libraries for programming languages
/usr/doc : Holds Linux documentation
/usr/man : Holds the online manual Man files
/usr/spool : Holds spooled files, such as those generated for printing jobs and
network transfers
/sbin : Holds system administration commands for booting the system
/var : Holds files that vary, such as mailbox files
/dev : Holds file interfaces for devices such as the terminals and printers
/etc : Holds system configuration files and any other system files.
/init : Contains all the functions needed to start the kernel, such as start_kernel().
/net : Contains the implementations of the various network protocols and the
code for sockets in the UNIX and Internet domains.
/arch : Architecture-dependent code is held in the subdirectories of arch/
/mm : Contains the memory management sources for the kernel.

Q2. Define the Kernel Architecture ?

Ans : Most Unix kernels are monolithic : each kernel layer is integrated into the whole kernel program
and runs in Kernel Mode on behalf of the current process. Microkernel operating systems, in contrast,
demand a very small set of functions from the kernel, generally including a few synchronization
primitives, a simple scheduler, and an interprocess communication mechanism. Microkernel-oriented
operating systems are generally slower than monolithic ones, because of the explicit message passing
between the different layers of the O. S., although the microkernel approach has some theoretical
advantages over the monolithic design.

Q. Define the process and the task_struct structure ?


Ans : The concept of a process is fundamental to any multiprogramming operating system. A process is
usually defined as an instance of a program in execution; thus, if 16 users are running vi at once,
there are 16 separate processes ( although they can share the same executable code).

Each process has some unique information, which is stored in a process descriptor of type
task_struct :

struct task_struct
{
    volatile long state;
    long counter;
    long priority;
    unsigned long signal;
    unsigned long blocked;
    unsigned long flags;
    int errno;
    int debugreg[8];
    struct task_struct *next_task;
    struct task_struct *prev_task;
    struct mm_struct mm;
    int pid, uid, gid;
    struct fs_struct fs;
    long utime, stime, cutime, cstime, start_time;
};

The state field of the task_struct describes what is currently happening to the process. The following
are the possible process states :

TASK_RUNNING : The process is either executing on the CPU or waiting to be executed.


TASK_INTERRUPTIBLE : The process is suspended (sleeping) until some condition becomes
true. Raising a hardware interrupt, releasing a system resource the process is waiting for, or
delivering a signal are examples of conditions that might wake up the process, that is put its state
back to TASK_RUNNING.
TASK_UNINTERRUPTIBLE : Like TASK_INTERRUPTIBLE, the process is suspended, but delivering a
signal leaves its state unchanged; it is woken only when the event it is waiting for occurs.
TASK_STOPPED: Process execution has been stopped : the process enters this state after
receiving a SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU signal.
TASK_ZOMBIE : Process execution has terminated, but the parent process has not yet issued a
wait()-like system call. The kernel cannot discard the data contained in the dead process's
task_struct because the parent could still need it.

The counter variable holds the time in ‘ticks’ for which the process can still run before a mandatory
scheduling action is carried out. The scheduler uses the counter value to select the next process.

The priority holds the priority of a process.

The signal variable contains a bit mask for signals received for the process.

The blocked field contains a bit mask of the signals the process has blocked, that is, plans to handle later.

flags contains the system status flags.

errno contains the error code, if one has been generated.

debugreg[8] holds the contents of the processor's debug registers, which are used when the process is being debugged.

next_task and prev_task : all processes are entered in a doubly linked list with the help of these
two pointers.

mm : the data needed by the process for memory management are collected in the
mm_struct structure.

Every process has its own process ID number (pid), user ID (uid) and group ID (gid).

The file-system-specific data are stored in fs_struct fs.

The utime and stime variables hold the time the process has spent in User Mode and System
Mode, cutime and cstime contain the totals of the corresponding times for all child processes,
start_time contains the time at which the current process was generated.

Q. What is process table in Linux kernel ?

Ans : Every process occupies exactly one entry in the process table. In Linux, this table is statically organized
and restricted in size to NR_TASKS, which denotes the maximum number of processes :

struct task_struct *task[NR_TASKS];

In older versions of the Linux kernel, all the processes present could be traced by searching the
task[ ] process table for entries. In the newer versions this information is stored in the linked lists
next_task and prev_task, which can be found in the task_struct structure. The external variable
init_task points to the start of the doubly linked circular list.

The entry task[0] has a special significance in Linux. task[0] is the INIT_TASK mentioned above,
which is the first to be generated when the system is booted and has something of a special role to
play.
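The following is a minimal sketch (not taken from the kernel sources) of how kernel code could walk
this circular list, assuming the older task_struct layout with init_task and next_task described above;
the helper name print_all_pids() is made up for illustration.

/* Sketch: walk the circular task list starting at init_task. */
#include <linux/kernel.h>   /* printk */
#include <linux/sched.h>    /* struct task_struct, init_task */

void print_all_pids(void)
{
    struct task_struct *p = &init_task;

    do {
        printk("pid %d\n", p->pid);
        p = p->next_task;          /* follow the forward pointer        */
    } while (p != &init_task);     /* stop when back at the start       */
}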

Q. What is an inode ? How is it used for the storage of regular files ?

Ans : All entities in Linux are treated as files. The information related to all these files (not the contents)
is stored in an inode table on the disk. For each file, there is an inode entry in the table. Inodes
contain information such as the file’s owner and access rights.

The inode structure-


struct inode
{
    dev_t i_dev;
    unsigned long i_ino;
    umode_t i_mode;
    uid_t i_uid;
    gid_t i_gid;
    off_t i_size;
    time_t i_mtime;
    time_t i_atime;
    time_t i_ctime;
};

The components are :

i_dev is a description of the device on which the file is located.


i_ino identifies the file within the device.
i_mode holds the file type and access rights.
i_uid holds the user ID of the owner.
i_gid holds the group ID of the owner.
i_size holds the size of the file in bytes.
i_mtime holds the time of the last modification to the file.
i_atime holds the time of the last access.
i_ctime holds the time of the last change to the inode.

The (dev, ino) pair thus provides an identification which uniquely identifies the file in the entire
file system.
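As a user-space illustration, the standard stat() system call returns exactly this (device, inode) pair
in the st_dev and st_ino fields; the file name used here is only an example.

/* Print the device number, inode number and size of a file. */
#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
    struct stat sb;

    if (stat("/etc/passwd", &sb) == -1) {
        perror("stat");
        return 1;
    }
    printf("device: %lu  inode: %lu  size: %ld bytes\n",
           (unsigned long) sb.st_dev,
           (unsigned long) sb.st_ino,
           (long) sb.st_size);
    return 0;
}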

Q. What are interrupts ? Define slow and fast interrupts.

Ans: An interrupt is a special signal generated by a hardware device. Interrupts are used to
allow the hardware to communicate with the O. S.

There are two types of interrupt in Linux, slow and fast :

Slow Interrupts :

Slow interrupts are the usual kind. After a slow interrupt has been processed, additional activities
requiring regular attention are carried out by the system - for example, the timer interrupt.

Fast Interrupts :

Fast interrupts are used for short, less complex tasks. While they are being handled, all other
interrupts are blocked, unless the handling routine involved explicitly enables them. A typical
example is the keyboard interrupt.
Q. What is the Booting process of Linux system?

Ans : There is something magical about booting a Linux system. First of all LILO ( The LInux LOader )
finds the Linux kernel and loads it into memory. Execution then begins at the entry point start: ; as the
name suggests, this is assembler code responsible for initializing the hardware. Once the essential
hardware parameters have been established, the processor is switched into Protected Mode by
setting the protected mode bit in the machine status word. A jump is then made to the start
address of the 32 bit code for the actual operating system kernel, and execution continues from startup_32: .
Once this initialization is complete, the first C function, start_kernel(), is called.

start_kernel() first saves all the data the assembler code has found about the hardware up to that point. All
areas of the kernel are then initialized. The process now running is process 0. It then generates a
kernel thread which executes the init() function.

The init() function carries out the remaining initialization. It starts the bdflush and kswap daemons
which are responsible for synchronization of the buffer cache contents with the file system and for
swapping.

Then the system call setup is used to initialize file systems and to mount the root file system. Then
an attempt is made to execute one of the programs /etc/init, /bin/init or /sbin/init. These usually
start the background processes running under Linux and make sure that the getty program runs on
each connected terminal - thus a user can log in to the system.

If none of the above-mentioned programs exists, an attempt is made to process /etc/rc and
subsequently start a shell so that the superuser can repair the system.

Q Define the system calls getpid, nice, pause, fork, execve, exit, wait.

Ans : getpid:

The getpid call is a very simple system call - it merely reads a value from the task structure and
returns it :

asmlinkage int sys_getpid(void)
{
    return current->pid;
}

nice :

The system call nice is a little more complicated : nice expects as its argument a number by which
the static priority of the current process is to be modified. Only the superuser is allowed to raise
his/her own priority. Note that a large argument for sys_nice() indicates a lower priority.

pause :

A call to pause interrupts the execution of the program until the process is reactivated by a signal.
This merely amounts to setting the status of the current process to TASK_INTERRUPTIBLE and
then calling the scheduler. This results in another task becoming active.

fork:

The system call fork is the only way of starting a new process. This is done by creating an identical
copy of the process that has called fork. fork is a very demanding system call: all the data of the
process have to be copied, and these can easily run to a few megabytes.

execve :

The system call execve enables a process to change its executing program. Linux permits a
number of formats for executable files, including the widely used COFF (Common Object File
Format) and ELF (Executable and Linkable Format).
exit :

A process is always terminated by calling the kernel function do_exit. This is done either directly by
the system call _exit or indirectly on the occurrence of a signal which cannot be intercepted. It
merely has to release the resources claimed by the process and, if necessary, inform other
processes.

wait :

The system call wait enables a process to wait for the end of a child process and interrogate the
exit code supplied. Depending on the argument given, wait4 will wait for a specified child process,
a child process in a specified process group or any child process.
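The following short user-space program sketches how fork, execve and wait work together; the
program run (/bin/ls) is an arbitrary example.

/* Create a child with fork(), replace its program with execve(),
 * and wait for it to terminate in the parent. */
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    char *argv[] = { "/bin/ls", "-l", NULL };
    char *envp[] = { NULL };
    int status;
    pid_t pid = fork();                 /* identical copy of this process */

    if (pid == 0) {                     /* child                          */
        execve("/bin/ls", argv, envp);  /* change the executing program   */
        _exit(127);                     /* only reached if execve fails   */
    }
    wait(&status);                      /* parent waits for the child     */
    printf("child exited with status %d\n", WEXITSTATUS(status));
    return 0;
}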

Q. What is the output of command ps ?

Ans : The ps command reports which processes are running at any instant. Linux assigns a unique number to
every process running in memory; this number is called the process ID, or simply PID.

PID TTY TIME COMMAND


2269 tty01 0:05 sh
2396 tty01 0:00 ps

PID : Process ID
TTY : The terminal from which the process was launched
TIME : The time that has elapsed since the process was launched
COMMAND : The name of the process

Q: What are links ? What is the difference between hard links & symbolic links ?

Ans : If you want to reference a file using different filenames, to access it from
different directories, you can create a link to that file with the help of the ln command.

$ ln original-file-name link-name

Hard links & Symbolic links :

A link within one disk and one user environment is called a hard link. A hard link may in some
situations fail when you try to link to a file in another user's directory: a file in one file system
cannot be linked by a hard link to a file in another file system. If you try to link to a file in another
user's directory that is located on another file system, your hard link will fail. To overcome this
restriction, you use symbolic links. A symbolic link holds the pathname of the file to which it is
linking.
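The ln command is built on the link and symlink system calls; the following sketch shows both calls
directly (the file names are placeholders).

/* Create a hard link and a symbolic link to an existing file. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* hard link : both names refer to the same inode, same file system */
    if (link("original-file-name", "hard-link-name") == -1)
        perror("link");

    /* symbolic link : stores only the pathname, may cross file systems */
    if (symlink("/other/fs/original-file-name", "sym-link-name") == -1)
        perror("symlink");

    return 0;
}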

Chapter 4 : Memory Management


Q Define the architecture-independent memory model in Linux ?

Ans : Memory management is primarily concerned with the allocation of main memory to requesting
processes. Two important features of the memory management function are protection and sharing.
Some of the main issues related to memory management in the Linux kernel are :

Pages of Memory :

The physical memory is divided into pages. The size of a memory page is defined by the
PAGE_SIZE macro. For the x86 processor, the size is set to 4 KB, while the Alpha processor uses
8 KB.

Virtual address space :

A process is run in a virtual address space. In the abstract memory model, the virtual address
space is structured as a kernel segment plus a user segment. Code and data for the kernel can be
accessed in the kernel segment, and code and data for the process in the user segment. A virtual
address is given by reference to a segment selector and the offset within the segment. When code
is being processed, the segment selector is already set and only offsets are used. In the kernel,
however, access is needed not only to data in the kernel segment but also to data in the user
segment, for the passing of parameters. For this purpose, the put_user() and get_user() functions
are defined.

Programmers casually refer to a memory address as the way to access the contents of a memory
cell. In x86 microprocessors, there are three kinds of address :

(i) Logical Addresses :

Included in the machine language instructions to specify the address of an operand or of an


instruction. Each logical address consists of a segment and an offset that denotes the
distance from the start of the segment to the actual address.

(ii) Linear Address :

A single 32 bit unsigned integer that can be used to address up to 4 GB, that is, up to 2^32
memory cells. Linear addresses are usually represented in hexadecimal notation; their
values range from 0x00000000 to 0xffffffff.

(iii) Physical Address :

Physical addresses are used to address memory cells included in memory chips. They
correspond to the electrical signals sent along the address pins of the microprocessor to
the memory bus. Physical addresses are represented as 32 bit unsigned integers.

Converting the Linear address :

Linux adopted a three-level paging model so that paging is feasible on 64 bit architectures. The x86
processor only supports a two-level conversion of the linear address, while the Alpha processor
supports a three-level conversion because it uses linear addresses with a width of 64 bits.

The three-level paging model defines three types of paging table :

Page (Global) directory


Page middle directory
Page Table

Page Global Directory :

The Page Global Directory includes the addresses of several page middle directories. It is of 12 bit
length. The different functions available for modification of the Page Global Directory are :

(i) pgd_alloc() : Allocates a page directory and fills it with 0.
(ii) pgd_bad() : Can be used to test whether the entry in the page directory is valid.
(iii) pgd_clear() : Deletes the entry in the page directory.
(iv) pgd_free() : Releases the page of memory allocated to the page directory.
(v) pgd_none() : Tests whether the entry has been initialized.
Page Middle Directory :

It includes the addresses of several page tables. It is of 13 bit length. The functions used for handling
the Page Middle Directory are :

(i) pmd_alloc() : Allocates a page middle directory to manage memory in the user area.
(ii) pmd_bad() : Tests whether the entry in the page middle directory is valid.
(iii) pmd_clear() : Deletes the entry in the page middle directory.
(iv) pmd_free() : Releases a page middle directory for memory in the user segment.
(v) pmd_offset() : Returns the address of the entry in the page middle directory to
which the address given as argument is allocated.
(vi) pmd_none() : Tests whether the entry in the page middle directory has been set.

Page Table :

Each Page Table entry points to a page frame. It is of 25 bits length. The ‘dirty’ attribute is set
when the contents of the memory page have been modified. A page table entry contains a number
of flags which describe the legal access modes to the memory page and their state :

PAGE_NONE : No physical memory page is referenced by the page table entry.

PAGE_SHARED : All types of access are permitted.
PAGE_COPY : This macro is historical and identical to PAGE_READONLY.
PAGE_READONLY : Only read and execute access is allowed to this page of
memory.
PAGE_KERNEL : Access to this page of memory is only allowed in the kernel
segment.

The following functions have been defined to manipulate the page table entries and their
attributes :

(i) mk_pte() : Returns a page table entry generated from the memory address
of a page and a variable of the pgprot_t type.
(ii) pte_alloc() : Allocates a new page table.
(iii) pte_clear() : Clears the page table entry.
(iv) pte_dirty() : Checks whether the ‘dirty’ attribute is set.
(v) pte_free() : Releases the page table.

Q Define the Virtual Address Space for a process in LINUX ?

Ans : The Virtual Address Space of a Linux process is segmented : a distinction is made between the
kernel segment and the user segment. For the x86 processor, two selectors along with their
descriptors must be defined for each of these segments. The data segment selector only permits
data to be read or modified, while the code segment selector allows code in the segment to be
executed and data to be read. The user process can modify its local descriptor table, which holds
the segment descriptors.

The user segment :

In User Mode, a process can access only the user segment. As the user segment contains the
data and code for the process, this segment needs to be different from those belonging to other
processes, and this means in turn that the page directories, or at least the individual page tables
for the different processes, must also be different. In the system call fork, the parent process’s
page directories and page tables are copied for the child process. An exception to this is the kernel
segment, whose page tables are shared by all the processes.

The system call fork has an alternative : clone. Both system calls generate a new thread, but with
clone the old thread and the newly generated thread can fully share memory. Thus, Linux
regards threads as tasks which share their address space with other tasks. The handling of
additional task-specific resources, such as the stack, can be controlled via parameters of the
system call clone.
Virtual memory :

All Linux systems provide a useful abstraction called virtual memory. Virtual memory acts as a
logical layer between the application memory requests and the hardware Memory management
Unit (MMU). Virtual memory has many purposes and advantages:
• Several processes can be executed concurrently.
• It is possible to run applications whose memory needs are larger than the available physical
memory.
• Processes can execute a program whose code is only partially loaded in memory.
• Each process is allowed to access a subset of the available physical memory.
• Processes can share a single memory image of a library or program.
• Programs can be relocatable, that is, they can be placed anywhere in physical memory.
• Programmers can write machine-independent code, since they do not need to be concerned
about physical memory organization.

A virtual memory area is defined by the data structure vm_area_struct. The structure
vm_operations_struct defines the possible function pointers enabling different operations to be
assigned to different areas.

System call brk :

At the start of a process, the value of the brk field in the process table entry points to the end of the
BSS segment (the segment for data that is not statically initialized). By modifying this
pointer, the process can allocate and release dynamic memory.

The system call brk can be used to find the current value of the pointer or to set it to a new value.
If the argument is smaller than the pointer to the end of process code, the current value of brk will
be returned. Otherwise an attempt will be made to set a new value.

The kernel function sys_brk() calls do_mmap() to map a private and anonymous area between
the old and new values of brk, corrected to the nearest page boundary, and returns the new brk value.
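From user space the brk pointer is usually moved through the sbrk() and brk() library calls, which
end up in sys_brk(); a small sketch:

/* Grow the data segment by one page and shrink it back again. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    void *old_end = sbrk(0);            /* current end of the data segment */
    printf("current brk: %p\n", old_end);

    if (sbrk(4096) == (void *) -1) {    /* move brk up by one page */
        perror("sbrk");
        return 1;
    }
    printf("new brk:     %p\n", sbrk(0));

    brk(old_end);                       /* release the memory again */
    return 0;
}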

The kernel segment :

A Linux system call is generally initiated by the software interrupt 0x80 being triggered. The
processor then reads the gate descriptor stored in the interrupt descriptor table. The processor
jumps to this address with the segment descriptor in the CS register pointing to the kernel
segment. The assembler routine then sets the segment selectors in the DS and ES registers in
such a way that memory accesses will read or write to data in the kernel segment.

As the page tables for the kernel segment are identical for all processes, this ensures that any
process in system mode will encounter the same kernel segment. In the kernel segment, physical
addresses and virtual addresses are the same except for the virtual memory areas mapped by
vmalloc().

In an x86 processor, the next step involves loading into the segment register FS a data segment
selector pointing to the user segment. Accesses to the user segment can then be made using the
put_user() and get_user() functions mentioned earlier. This may cause a general protection error
if the referenced address is protected, or a page fault error if the page cannot be accessed. To
avoid these problems, system routines have to call the verify_area() function before they access
the user segment. This checks whether read or write access to the given area of the user segment
is permitted, investigating all the virtual memory areas affected by the area involved.

Q Define the static & Dynamic memory allocation in the kernel segment ?

Ans : Static memory allocation in the kernel segment :

In the system kernel, it is often necessary to allocate memory for kernel data structures. Before the kernel
generates its first process when it is run, it calls initialization routines for a range of kernel
components from start_kernel(). These routines are able to reserve memory in the kernel segment:
an initialization function reserves memory by returning a value higher than the parameter
memory_start passed to it.
Dynamic memory allocation in the kernel segment :

The functions used for dynamic memory allocation are kmalloc() and kfree(). The kmalloc()
function attempts to reserve the extent of memory specified by size. The memory that has been
reserved can be released again by the function kfree(). The function __get_free_pages() may be
called and, if no free pages are available and other pages therefore need to be copied to
secondary storage, this may block.

In the Linux kernel, the __get_free_pages() function can only be used to reserve contiguous areas
of memory. As kmalloc() can reserve far smaller areas of memory, however, the free memory in
these areas needs to be managed. The central data structure for this is the table sizes[ ], which
contains descriptors for the different sizes of memory area.

One page descriptor manages each contiguous area of memory. This page descriptor is stored at
the beginning of every memory area reserved by kmalloc(). Within the page itself, all the free
blocks of memory are managed in a linear list. All the blocks of memory in a memory area
collected into one list are of the same size.

The block itself has a block header, which in turn holds a pointer to the next element if the block is
free, or else the actual size of the memory area allocated in the block.

Structures for kmalloc

Originally, kmalloc() provided the only facility for dynamic allocation of memory in the kernel. In addition, the
amount of memory that could be reserved was restricted to the size of one page of memory. The
situation was improved by the function vmalloc() and its counterpart vmfree(). The advantage of
the vmalloc() function is that the size of the area of memory requested can be better adjusted to
actual needs than when using kmalloc(), which requires 128 KB of consecutive physical memory to
reserve just 64 KB. Besides this, vmalloc() is limited only by the size of free physical memory and
not by its segmentation, as kmalloc() is. Since vmalloc() does not return any physical addresses
and the reserved areas of memory can be spread over non-consecutive pages, this function is not
suitable for reserving memory for DMA.
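A minimal kernel-side sketch of the two allocators is given below; it uses the present-day header
names and the release function vfree() (called vmfree() in the text above), and is not a complete module.

/* Reserve and release kernel memory with kmalloc()/kfree() and
 * vmalloc()/vfree(). */
#include <linux/slab.h>      /* kmalloc, kfree */
#include <linux/vmalloc.h>   /* vmalloc, vfree */

void memory_demo(void)
{
    char *small = kmalloc(512, GFP_KERNEL);  /* small, physically contiguous   */
    char *large = vmalloc(256 * 1024);       /* large, only virtually contiguous */

    if (small)
        kfree(small);
    if (large)
        vfree(large);
}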

Q Define the update and bdflush processes ?

Ans : The update process is a Linux process which at periodic intervals calls the system call bdflush with
an appropriate parameter. All modified buffer blocks that have not been used for a certain time are
written back to disk, together with all superblock and inode information. The interval used by
update as a default under Linux is five seconds.

bdflush is implemented as a kernel thread and is started during kernel initialization. In an endless
loop, it writes back the number of block buffers marked ‘dirty’ given in the bdflush parameter
( default is 500). Once this is completed, a new loop starts immediately if the proportion of modified
block buffers to the total number of buffers in the cache becomes too
high. Otherwise, the process switches to the TASK_INTERRUPTIBLE state.

The kernel thread can be woken up using the wakeup_bdflush() function.

Q Define the paging under Linux ?

Ans : The RAM memory in a computer has always been limited and, compared to fixed disks, relatively
expensive. Particularly in multi-tasking operating systems, the limit of working memory is quickly
reached. Thus it was not long before someone hit on the idea of offloading temporarily unused
areas of primary storage(RAM) to secondary storage.

The traditional procedure for this used to be the so-called ‘swapping’ which involves saving entire
processes from memory to a secondary medium and reading them in again. This approach does
not solve the problem of running processes with large memory requirements in the available
primary memory. Besides this, saving and reading in whole processes is very inefficient.
When new hardware architectures (VAX) were introduced, the concept of demand paging was
developed. Under the control of a memory management unit (MMU) the entire memory is divided
up into pages, with only complete pages of memory being read in or saved as required. As all
modern processor architectures, including the x86 architecture, support the management of paged
memory, demand paging is employed by Linux. Pages of memory which have been mapped
directly to the virtual address area of a process using do_mmap() without write authorization are
not saved, but simply discarded. Their contents can be read in again from the files which were
mapped. Modified memory pages, in contrast, must be written into swap space.

Pages of memory in the kernel segment cannot be saved, for the simple reason that routines and
data structures which read memory pages back from secondary storage must always be present in
primary memory.

Linux can save pages to external media in two ways. In the first, a complete block device is used
as the external medium. This will typically be a partition on a hard disk. The second uses fixed-
length files in a file system for its external storage. The term ‘swap space’ may refer to either a
swap device or a swap file.

Using a swap device is more efficient than using a swap file. In a swap device, a page is always
saved to consecutive blocks, whereas in a swap file, the individual blocks may be given various
block numbers depending on how the particular file system fragmented the file when it was set up.
These blocks then need to be found via the swap file’s inode. On a swap device, the first block is
given directly by the offset for the page of memory to be saved or read in.

Chapter 5 ( IPC-INTER PROCESS COMMUNICATION)


Q Define the IPC ?

Ans : There are many applications in which processes need to cooperate with each other. The Linux
IPC (Inter Process communication) facility provides many methods for multiple process to
communicate with each other.

A variety of forms of inter-process communication can be used under Linux. These support
• resource sharing
• synchronization
• connectionless and
• connection-oriented data exchange

Resource sharing :

If processes have to share a resource (such as a printer), it is important to make sure that no more
than one process is accessing the resource - that is, sending data to the printer - at any given time. If
different processes send data at the same time, a race condition arises, and communication between
processes must prevent it. Eliminating race conditions is only one possible use of inter-process
communication.

Synchronization in the kernel :

As the kernel manages the system resources, access by processes to these resources must be
synchronized. A process will not be interrupted by the scheduler so long as it is executing a system
call. This only happens if it blocks or itself calls schedule() to allow the execution of other processes.
Whenever a process is running in its critical section, no other process may run in its critical section;
to achieve this, different synchronization methods are provided by the Linux IPC.
Connectionless data exchange :

In connectionless data exchange, a process simply sends data packets, which may be given a
destination address or a message type, and leaves it to the infrastructure to deliver them. For
example, when we send a letter we rely on a connectionless model.

Connection oriented data exchange :

In connection-oriented data exchange, the two parties to the communication must set up a
connection before communication can start. For example, when we make a telephone call, or when a
client application sends a request through a client socket and the server socket receives the request
and creates the connection, we are using connection-oriented data exchange.

Q How does Linux implement the different forms of interprocess communication ? Explain briefly.

Ans : Linux implements interprocess communication in the following forms :-

Communication by files :

Communication via files is in fact the oldest way of exchanging data between programs. Program A
writes data to a file and program B reads the data out again. In a multi-tasking system, however,
both programs could be run as processes, at least quasi-parallel to each other. Race conditions
then usually produce inconsistencies in the file data, which result from one program reading a data
area before the other has completed modifying it, or both processes modifying the same area
at the same time. To avoid race conditions on files, different types of locking
mechanism are used in Linux :-

1. Mandatory Locking : -

Mandatory locking blocks read and write operations throughout the entire locked area.

There are two methods for locking entire files :

In addition to the file to be locked, an auxiliary file known as a lock file is created, which
refuses access to the file when it is present. The system calls link, creat and open can be used
for this locking : link creates the lock file only if it does not yet exist; creat aborts with an
error code if the calling process does not possess the appropriate access rights; with open, the
lock file is opened only if it does not already exist (using the O_CREAT and O_EXCL flags).

The drawback to all three of these is that after a failure the process must repeat its attempt to
set up a lock file. Usually, the process will call sleep() to wait for one second and then try
again.

The entire file can also be locked by means of the fcntl system call. This is invoked either
through the flock() or the lockf() call.

2. Advisory Locking : -

With advisory locking, all processes accessing the file for read or write operations have to set the
appropriate lock and release it again.

Locking file areas is usually referred to as record locking. Advisory locking of file areas can be
achieved with the system call fcntl. The prototype of fcntl() is

int sys_fcntl(unsigned int fd, unsigned int cmd, unsigned long arg);

fd : The parameter fd is used to pass a file descriptor.


cmd : the command; for locking purposes it can be F_GETLK, F_SETLK or F_SETLKW
arg : arg must be a pointer to an flock structure which stores the lock type ( F_RDLCK,
F_WRLCK, F_UNLCK, F_SHLCK, or F_EXLCK), the start position, the length and the process id.

Semantics of fcntl locks :

Existing locks               Set read lock    Set write lock
None                         Possible         Possible
More than one read lock      Possible         Not legal
One write lock               Not legal        Not legal

The list of locked files is managed in a doubly linked list, file_lock_table.
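The following user-space sketch locks the first 100 bytes of a file with F_SETLKW and releases the
lock again; the file name is a placeholder.

/* Advisory record locking with fcntl() and the flock structure. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    struct flock fl;
    int fd = open("datafile", O_RDWR);

    if (fd == -1) {
        perror("open");
        return 1;
    }

    fl.l_type   = F_WRLCK;       /* request a write (exclusive) lock          */
    fl.l_whence = SEEK_SET;
    fl.l_start  = 0;             /* lock the first 100 bytes of the file      */
    fl.l_len    = 100;
    fcntl(fd, F_SETLKW, &fl);    /* F_SETLKW blocks until the lock is granted */

    /* ... work on the locked record here ... */

    fl.l_type = F_UNLCK;         /* release the lock again */
    fcntl(fd, F_SETLK, &fl);

    close(fd);
    return 0;
}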

Pipes : -

A pipe is a one-way flow of data between processes : all the data written by a process to the
pipe is routed by the kernel to another process, which can thus read it.

In UNIX shells, pipes can be created by means of the | operator. For example, the following statement
instructs the shell to create two processes connected by a pipe.

$ ls | more

The standard output of the first process, which executes the ls program, is redirected to the pipe;
the second process, which executes the more program, reads its input from the pipe.

Another variant of pipes consists of named pipes, also known as FIFOs. They can be set up in a
file system using the command

$ mkfifo filename

Pipes are a special type of file in Linux; their file type is p.

The system call pipe creates a pipe, which involves setting up a temporary inode and allocating a
page of memory. The call returns one file descriptor for reading and one for writing.
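A minimal user-space sketch of the pipe system call: the parent writes into the pipe and the child
reads from it.

/* pipe() returns two descriptors: fd[0] for reading, fd[1] for writing. */
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    int fd[2];
    char buf[32];

    if (pipe(fd) == -1) {
        perror("pipe");
        return 1;
    }

    if (fork() == 0) {               /* child: read end only */
        close(fd[1]);
        read(fd[0], buf, sizeof(buf));
        printf("child read: %s\n", buf);
        _exit(0);
    }

    close(fd[0]);                    /* parent: write end only */
    write(fd[1], "hello", 6);
    close(fd[1]);
    wait(NULL);
    return 0;
}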

System V IPC : -

IPC is an abbreviation that stands for interprocess communication. The classical forms of inter-
process communication-semaphores, message queues and shared memory-were implemented in
a special variant of UNIX. These were later integrated into System V and are now known as
System V IPC. It denotes a set of system calls that allows a user mode process to :

Synchronize itself with other processes by means of semaphores.

Send messages to other processes or receive messages from them.
Share a memory area with other processes.

IPC data structures are created dynamically when a process requests an IPC resource ( a
semaphore, a message queue, or a shared memory segment). An IPC resource may be used by
any process, including those that do not share the ancestor that created the resource.

Since a process may require several IPC resources of the same type, each new resource is identified
by a 32 bit IPC key, which is similar to the file pathname in the system’s directory tree. IPC
identifiers are assigned to IPC resources by the kernel and are unique within the system, while IPC
keys can be freely chosen by programmers.

Access permissions are managed by the kernel in the structure ipc_perm .


Semaphores :

Semaphores are counters used to provide controlled access to shared data structures for multiple
processes. The semaphore value is positive if the protected resource is available, and negative or
zero if the protected resource is currently not available. A process that wants to access the
resource decrements the semaphore value by 1. It is allowed to use the resource only if the old
value was positive; otherwise the process waits until the semaphore becomes positive. Depending
on the number of resources, an array of semaphores can be set up using system calls.

struct semaphore
{
    int count;
    struct wait_queue *wait;
};

A semaphore is taken to be occupied if count has a value less than or equal to 0. All processes
wishing to occupy the semaphore enter themselves in the wait queue. They are then notified when
it is released by another process. There are two auxiliary functions to occupy and release a
semaphore, down() and up().

Message queues :

Processes can communicate with each other by means of IPC messages. Each message generated
by a process is sent to an IPC message queue where it stays until another process reads it.

A message is composed of a fixed sized header and a variable length text; it can be labeled with
an integer value ( the message type), which allows a process to selectively retrieve messages from
its message queue. Once a process has read a message from the IPC message queue, the kernel
destroys it; therefore, only one process can retrieve a given message.

In order to send a message, a process invokes the msgsnd() function, passing as parameters :

• The IPC identifier of the destination message queue
• The size of the message text
• The address of a user mode buffer that contains the message type immediately followed by
the message text.

To retrieve a message, a process invokes the msgrcv() function, passing to it :

• The IPC identifier of the IPC message queue resource
• The pointer to a user mode buffer to which the message type and message text should be
copied
• The size of this buffer
• A value that specifies which message should be retrieved
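A short user-space sketch of these calls follows; the key value 1234 and the message text are
arbitrary examples.

/* Send and receive one message on a System V message queue. */
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>

struct demo_msg {
    long mtype;                  /* message type, must be > 0 */
    char mtext[64];              /* message text              */
};

int main(void)
{
    int qid = msgget((key_t) 1234, IPC_CREAT | 0600);
    struct demo_msg msg = { 1, "hello queue" };

    if (qid == -1) {
        perror("msgget");
        return 1;
    }
    msgsnd(qid, &msg, strlen(msg.mtext) + 1, 0);   /* send the message  */
    msgrcv(qid, &msg, sizeof(msg.mtext), 1, 0);    /* retrieve it again */
    printf("received: %s\n", msg.mtext);

    msgctl(qid, IPC_RMID, NULL);                   /* remove the queue  */
    return 0;
}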

Shared Memory :

The most useful IPC mechanism is shared memory, which allows two or more processes to access
some common data structures by placing them in a shared memory segment. Each process that
wants to access the data structures included in a shared memory segment must add to its address
space a new memory region, which maps the page frames associated with the shared memory
segment. Such page frames can thus be easily handled by the kernel through demand paging.

The shmget() function is invoked to get the IPC identifier of a shared memory segment, optionally
creating it if it does not already exist.

The drawback to shared memory is that the processes need to use additional synchronization
mechanisms to ensure that race conditions do not arise.
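A user-space sketch of a shared memory segment is shown below; the key value 5678 and the
segment size are arbitrary examples.

/* Create a shared memory segment, attach it, write into it and remove it. */
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
    int shmid = shmget((key_t) 5678, 4096, IPC_CREAT | 0600);
    char *area;

    if (shmid == -1) {
        perror("shmget");
        return 1;
    }
    area = shmat(shmid, NULL, 0);       /* map the segment into this process */
    strcpy(area, "shared data");        /* any attached process sees this    */
    printf("%s\n", area);

    shmdt(area);                        /* detach the segment                */
    shmctl(shmid, IPC_RMID, NULL);      /* remove it from the system         */
    return 0;
}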
Q Define the system call ptrace ?

Ans : Execution Tracing is a technique that allows a program to monitor the execution of another
program. The traced program can be executed step-by-step, until a signal is received, or until a
system call is invoked. Execution tracing is widely used by debuggers, together with other
techniques like the insertion of breakpoints in the debugged program and run-time access to its
variables. In Linux, execution tracing is performed through the ptrace() system call, which can
handle the following commands :

PTRACE_TRACEME : Start execution tracing for the current process
PTRACE_ATTACH : Start execution tracing for another process
PTRACE_DETACH : Terminate execution tracing
PTRACE_KILL : Kill the traced process
PTRACE_PEEKTEXT : Read a 32 bit value from the text segment
PTRACE_PEEKDATA : Read a 32 bit value from the data segment
PTRACE_POKETEXT : Write a 32 bit value into the text segment
PTRACE_POKEDATA : Write a 32 bit value into the data segment
PTRACE_CONT : Resume execution
Several monitored events can be associated with a traced program :

• End of execution of a single assembly instruction


• Entering a system call
• Exiting from a system call
• Receiving a signal

When a monitored event occurs, the traced program is stopped and a SIGCHLD signal is sent to
its parent. When the parent wishes to resume the child’s execution, it can use one of the
continuation commands, such as PTRACE_CONT.
A process can also be traced using some debugging features of the Intel Pentium processors. For
example, the parent could set the values of the dr0,….dr7 debug registers for the child by using
the PTRACE_POKEUSR command. When a monitored event occurs, the CPU raises the “Debug”
exception; the exception handler can then suspend the traced process and send the SIGCHLD
signal to the parent.

Chapter 6 : The Linux file system


Q Explain the representation of file systems in the Linux kernel ?

Ans: The file system is the most visible aspect of an operating system. It provides the mechanism for
on-line storage of and access to both data and programs of the operating system. A central
demand made of a file system is the purposeful structuring of data. When selecting a purposeful
structure, however, two factors not to be neglected are the speed of access to data and a facility
for random access.

Each file system starts with a boot block. This block is reserved for the code required to boot the
operating system.

The range of file systems supported is made possible by the unified interface to the Linux kernel.
This is the Virtual File System Switch (VFS). The virtual file system is a kernel software layer
that handles all system calls related to a standard Linux filesystem. Its main strength is providing
a common interface to several kinds of filesystems.

For instance, let us assume that a user issues the shell command:

$ cp /mnt/floppy/TEST /tmp/test

where /mnt/floppy is the mount point of an MS-DOS diskette and /tmp is a normal EXT2
directory. The cp program is not required to know the filesystem types of /mnt/floppy/TEST and
/tmp/test. Instead, cp interacts with the VFS by means of generic system calls well known to
anyone who has done Linux programming.

Whenever a different type of filesystem is to be used, it must first be registered with the VFS. This is
done by calling register_filesystem(), which fills in a file_system_type structure storing the
information about the filesystem.
Once a file system implementation has been registered with the VFS, file systems of this type can
be administered.

The common file model consists of the following structure types :

Mounting
The superblock structure
The inode structure
The file structure

Mounting :

Before a file can be accessed, the file system containing the file must be mounted. This can be
done using either the system call mount or the function mount_root(). The mount_root() function
takes care of mounting the first file system. It is called by the system call setup after all the file
system implementations permanently included in the kernel have been registered. The setup call
itself is called just once, immediately after the init process is created by the kernel function init().

The superblock :

All the information which is essential for managing the file system is held in the superblock. Every
mounted file system is represented by a super_block structure. These structures are held in the
static table super_block[ ]. The superblock is initialized by the function read_super() in the
Virtual File System. The superblock contains information on the entire file system, such as block
size, access rights and time of the last change. The superblock also holds references to the file
system’s root inode.

Some important possible operations on super_block structure are as follows :

write_super() : The write_super function is used to save the information of the superblock.

put_super() : The VFS calls this function when unmounting file systems, when it should
also release the superblock and other information buffers.

read_inode() : The inode structure is initialized by this function like read_super() fills
super_block structure.

notify_change() : The changes made to the inode via system calls are acknowledged by
notify_change().
write_inode() : This function saves the inode structure, analogous to write_super().

The inode :
Some important possible operations on inode structure are as follows :

create() : Creates a new disk inode for a file.

lookup() : Searches for the inode of a given file.
link() : Sets up a hard link.
unlink() : Deletes the specified file in the specified directory.
symlink() : Creates a symbolic link.

The file structure :


The file structure describes how a process interacts with a file it has opened; it is
created when the file is opened. The structure contains information
on a specific file’s access rights f_mode, the current file position f_pos, the type of access f_flags
and the number of accesses f_count. The file structures are managed in a doubly linked list via
the pointers f_next and f_prev. This file table can be accessed via the pointer first_file.

Some important possible operations on the file structure are as follows :

lseek() : The job of the lseek function is to deal with positioning within the file.
read() : This function copies count bytes from the file into the buffer buf in the user address
space.
write() : The write function operates in an analogous manner to read() and copies data from
the user address space to the file.
select() : This function checks whether data can be read from a file or written to one.
ioctl() : The ioctl() function sets device-specific parameters.

Q Explain the proc filesystem ?

Ans : Linux supports a number of file systems; described here is the process file system (Proc), modelled
on that of System V Release 4. Each process in the system which is currently running is assigned a directory
/proc/pid, where pid is the process identification number of the relevant process. This directory
contains files holding information on certain characteristics of the process.

When the Proc file system is mounted, the VFS function read_super() is called by do_mount(),
and in turn calls the function proc_read_super() for the Proc file system in the file_system list.

iget() generates the inode for the Proc root directory, which is entered in the superblock. The
parse_options() function then processes the mount option data that have been provided and
sets the owner of the root inode.

Accessing the file system is always carried out by accessing the root inode of the file system. The
first access is made by calling iget(). If the inode does not exist, this function then calls the
proc_read_inode() function entered in the proc_sops structure.

This inode describes a directory with read and execute permissions for all processes. The
proc_root_inode_operations only provides two functions: the component readdir in the form of
the proc_readroot() function and the component lookup as the proc_lookuproot() function. Both
functions operate using the table root_dir[ ], which contains the different entries for the root
directory.

The individual structures contain the inode number, the length of the filename, and the name itself.
proc_lookuproot() determines the inode of a file by reference to the inode of the directory
and the name of a file contained in it.

In the function proc_read_inode(), the inode for most normal files is assigned the function vector
proc_array_inode_operations. All that is implemented in this, however, is the function
array_read() in the standard file operations, which reads the files.

Q Explain the Linux filesystem (ext2)?

Ans : As Linux was initially developed under MINIX, it is hardly surprising that the first Linux file system
was the MINIX file system. However, this file system restricts partitions to a maximum of 64 MB
and filenames to no more than 14 characters, so the search for a better file system was not long in
starting. The result was the Ext file system - the first to be designed especially for Linux. Although
it allowed partitions of up to 2 GB and filenames up to 255 characters and included several
significant extensions, it offered unsatisfactory performance. The Second Extended Filesystem
(Ext2) was introduced in 1994 : besides including several new features, it is quite efficient and
robust and has become the most widely used Linux file system.

The most significant features are :


Block fragmentation :

System administrators usually choose large block sizes for accessing recent disks. As a result,
small files stored in large blocks waste a lot of disk space. This problem can be solved by allowing
several files to be stored in different fragments of the same block.

Access Control Lists :

Instead of classifying the users of a file under three classes - owner, group, and others - an access
control list (ACL) is associated with each file to specify the access rights for any specific users or
combinations of users.

Handling of compressed and encrypted files :

The new option, which must be specified when creating a file, will allow users to store compressed
and / or encrypted versions of their files on disk.

Logical deletion :

An undelete option will allow users to easily recover, if needed, the contents of previously removed
file.

The structure of the Ext2 file system :

The first block in any Ext2 partition is never managed by the Ext2 filesystem, since it is reserved for
the partition boot sector. The rest of the Ext2 partition is split into block groups. Block groups reduce file
fragmentation, since the kernel tries to keep the data blocks belonging to a file in the same block group
where possible. Each block in a block group contains one of the following pieces of information :

• A copy of the filesytem’s superblock


• A copy of the group of block group descriptors
• A data block bitmap
• A group of inodes
• An inode bitmap
• A chunk of data belonging to a file; that is, a data block

An Ext2 disk superblock is stored in an ext2_super_block structure, which contains the total number
of inodes, the filesystem size in blocks, the number of reserved blocks, the free blocks counter, the
free inodes counter, the block size, the fragment size and other important information.

Each block group has its own group descriptor, an ext2_group_desc structure, and contains its own
inode table.

Directories in the Ext2 file system :

In the Ext2 file system, directories are administered using a singly linked list. Ext2 implements directories
as a special kind of file whose data blocks store filenames together with the corresponding inode
numbers. In particular, such data blocks contain structures of type ext2_dir_entry_2. The structure has a
variable length, since the last name field is a variable length array of up to EXT2_NAME_LEN characters
(usually 255). The name_len field stores the actual file name length. The rec_len field may be interpreted
as a pointer to the next valid directory entry : it is the offset to be added to the starting address of the directory
entry to get the starting address of the next valid directory entry.
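The layout described above corresponds roughly to the following structure (field names follow the
kernel's ext2_fs.h, where the type is called ext2_dir_entry_2; the declaration here is only illustrative).

/* Sketch of an Ext2 directory entry. */
#include <linux/types.h>          /* __u32, __u16, __u8 */

#define EXT2_NAME_LEN 255

struct ext2_dir_entry_2 {
    __u32 inode;                  /* inode number of the entry      */
    __u16 rec_len;                /* offset to the next valid entry */
    __u8  name_len;               /* actual length of the file name */
    __u8  file_type;              /* regular file, directory, ...   */
    char  name[EXT2_NAME_LEN];    /* file name, not null-terminated */
};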

Block allocation in the Ext2 file system

A problem commonly encountered in all file systems is the fragmentation of files - that is, the ‘scattering’
of files into small pieces as a result of the constant deleting and creating of new files. The Ext2 file system
uses two algorithms to limit the fragmentation of files.
Target-oriented allocation :

This algorithm always looks for space for new data blocks in the area of a ‘target block’. If this block is
itself free, it is allocated. Otherwise, a free block is sought within 32 blocks of the target block, and if
found, is allocated. If this fails, the block allocation routine tries to find a free block which is at least in the
same block group as the target block. Only after these avenues have been exhausted are other block
groups investigated.

Pre-allocation :

If a free block is found, up to eight following blocks are reserved (if they are free). When the file is closed,
the remaining blocks still reserved are released. This also guarantees that as many data blocks as
possible are collected into one cluster.

Chapter 7 : Device drivers under Linux


A device driver is an interface between a device and the O. S.; it is the software which operates the
hardware. There is a wide variety of hardware available for Linux computers, and each piece of hardware
has its own device driver. Without device drivers, an operating system would have no means of input or
output and no file system. Device drivers are uniquely identified by their major numbers. A device driver may be
controlling a number of physical and virtual devices, for example a number of hard disks and partitions;
thus, the individual device is accessed via its minor number, an integer between 0 and 255. Each
individual device can thus be uniquely identified by the device type (block or character), the major
number of the device driver and its minor number.

Q Explain character and block devices under Linux. ?

Ans : Block devices :

Block devices are those which transfer data block by block and provide the facility of random
access. A block device is divided into a specific number of equal-sized blocks, and each block
has a unique number, so the file system can define an addressing scheme with the help of these block
numbers. Using such an address, any data on the device can be accessed directly, at any location.
For reading from and writing to block devices, Linux maintains a buffer area in RAM. Random access
is an absolute necessity for file systems, which means that they can only be mounted on block
devices. RAM disks, hard disks, floppy disks and CD-ROMs are all block devices.

Character devices :

Character devices, on the other hand, process data character by character and sequentially,
and Linux does not maintain a buffer area for them. Some character devices maintain their own
buffers for internal block transfers, but these blocks are sequential in nature and
cannot be accessed randomly. For example, an ink-jet printer and a laser printer collect characters
line by line and page by page respectively: the characters are stored in a buffer and, when the required
limit is reached, the device sends the whole block of data for printing. Some character devices are :
printer, scanner, sound card, monitor, PC speaker.

Q In the context of LINUX device drivers, write short notes on the following :

- Polling
- Interrupt
- Interrupt sharing
- Bottom halves
- Task queues
- DMA

Ans : Polling :

In polling, the driver constantly checks the hardware. The driver defines a timeout (jiffies + waiting
time) and continuously checks the hardware until the timeout limit is reached. Once the timeout
expires, the timeout error handling gives an appropriate error message - in the case of a printer, for
example, that the printer is out of paper or offline. Polling results in a pointless waste of
processor time, but it is sometimes the fastest way of communicating with the hardware. The
device driver for the parallel interface works by polling as the default option.
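
As an illustration, a hedged sketch of such a polling loop with a jiffies-based timeout is shown below;
lp_ready(), LP_TIMEOUT and report_printer_error() are hypothetical names used only for this example.

#include <linux/sched.h>        /* jiffies, HZ, schedule() */
#include <linux/errno.h>

#define LP_TIMEOUT (10 * HZ)    /* assumed waiting time: 10 seconds */

static int poll_printer(void)
{
    unsigned long timeout = jiffies + LP_TIMEOUT;

    while (jiffies < timeout) {
        if (lp_ready())              /* hardware has accepted the data */
            return 0;
        schedule();                  /* give other processes a chance */
    }
    report_printer_error();          /* e.g. printer out of paper, offline */
    return -EIO;
}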
Interrupt :

The use of interrupts, on the other hand, is only possible if they are supported by the hardware.
Here, the device informs the CPU via an interrupt channel (IRQ) that it has finished an operation.
This breaks into the current operation and carries out an interrupt service routine (ISR). Further
communication with the device then takes place within the ISR.

A serial mouse, for example, sends data to the serial port on every movement, triggering an IRQ. The
data from the serial port is read first by the handling ISR, which passes it on to the application
program.

IRQs are installed using the function request_irq(), to which different parameters are passed : the
IRQ number, the address of the handling routine, the device name, the device id and the irqflags.

Irqflags specifies the type of interrupt. If irqflags is zero (NULL), the interrupt is a slow interrupt;
if the value SA_INTERRUPT is set, it is a fast interrupt; and if SA_SHIRQ is set, it is a sharable
interrupt.
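
A hedged sketch of installing a handler with the 2.x-era request_irq() interface follows; the signature and
the SA_* flags have changed in later kernels, and MY_IRQ and the "mydev" name are assumptions.

#include <linux/sched.h>
#include <linux/interrupt.h>

#define MY_IRQ 5                        /* example interrupt line */

static void my_isr(int irq, void *dev_id, struct pt_regs *regs)
{
    /* further communication with the device takes place here */
}

static int install_handler(void)
{
    /* irqflags: 0 = slow interrupt, SA_INTERRUPT = fast, SA_SHIRQ = sharable */
    return request_irq(MY_IRQ, my_isr, SA_INTERRUPT, "mydev", NULL);
}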

Interrupt sharing :

Several hardware devices may use the same IRQ number. If different devices which use the same
interrupt are installed on the same PCI board, they would conflict with each other. In this case interrupt
sharing provides the facility to use both devices on the same PCI board : if one device is using the PCI
bus, the second device waits until the bus becomes free. If an ISR capable of interrupt sharing is
installed, this must be communicated to the request_irq() function by setting the SA_SHIRQ flag. If
another ISR also capable of interrupt sharing was already installed on this interrupt, a chain is
built.

Bottom Halves :

It frequently happens that not all the functions need to be performed immediately after an interrupt
occurs; although ‘important’ actions need to be taken care of at once, others can be handled later
or would take a relatively long time and it is preferable not to block the interrupt. A bottom half is a
low-priority function, usually related to interrupt handling, that is waiting for the kernel to find a
convenient moment to run it.

Before invoking a bottom half for the first time, it must be initialized. This is done by invoking the
init_bh() function, which inserts the routine address in the nth entry of bh_base. The bh_base table
groups all bottom halves together; it is an array of pointers to bottom halves and can include up to
32 entries, one for each type of bottom half.
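
A hedged sketch using this old bottom-half interface from <linux/interrupt.h> (since replaced by softirqs
and tasklets) might look as follows; CONSOLE_BH is used only as an example of one of the fixed slots.

#include <linux/interrupt.h>

static void my_bottom_half(void)
{
    /* the deferred, less urgent part of the interrupt handling */
}

static void setup_bottom_half(void)
{
    init_bh(CONSOLE_BH, my_bottom_half);   /* enter the routine in bh_base[] */
}

/* Inside the ISR the bottom half is then marked for later execution with
   mark_bh(CONSOLE_BH); the kernel runs it at a convenient moment. */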

Some Linux Bottom Halves are as follows:

CONSOLE_BH : Virtual console
KEYBOARD_BH : Keyboard
NET_BH : Network interface
SCSI_BH : SCSI interface
SERIAL_BH : Serial port
TIMER_BH : Timer

Task Queues :

A task queue is a dynamic extension of the concept of bottom halves. The use of bottom halves is
somewhat difficult because their number is limited to only 32, and some of them are already
assigned to fixed tasks. Task queues allow a number of functions to be entered in a queue and
processed one after another at a later time.

A queue element is described by the tq_struct structure, which holds :

- the pointer to the next entry in *next
- the synchronization flag sync
- the function to be called
- the argument passed to the function at call time in *data

Before a function can be entered in a task queue, a tq_struct structure must be created and
initialized.
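
A hedged sketch based on the old task-queue interface from <linux/tqueue.h> (later replaced by work
queues) is given below; tq_scheduler is one of the predefined queues, and the old GCC "field:" initializer
style is used as in kernels of that era.

#include <linux/tqueue.h>

static void my_deferred_fn(void *data)
{
    /* work that can be carried out at a later, more convenient time */
}

static struct tq_struct my_task = {
    routine: my_deferred_fn,          /* function to be called */
    data:    NULL,                    /* argument passed at call time */
};

static void defer_work(void)
{
    queue_task(&my_task, &tq_scheduler);   /* enter the function in a queue */
}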

DMA mode :

Direct memory access or DMA, is the hardware mechanism that allows peripheral components to
transfer their I/O data directly to and from main memory without the need for the system processor
to be involved in the transfer. Use of this mode is ideal for multi-tasking, as the CPU can take care
of other tasks during the data transfer. The device will generally trigger an IRQ after the transfer,
so that the next DMA transfer can be prepared in the ISR handling the procedure.

In a DMA operation the data transfer takes place without CPU intervention : the data bus is directly
driven by the I/O device and the DMAC (Direct Memory Access Controller). Therefore, when the
kernel sets up a DMA operation, it must write the bus address of the memory buffer involved to the
proper I/O ports of the DMAC or I/O device.
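
A hedged sketch of setting up such a transfer with the classic ISA DMA helpers from <asm/dma.h>
follows; MY_DMA_CHANNEL, the device name and the buffer parameters are assumptions for illustration.

#include <asm/dma.h>
#include <linux/errno.h>

#define MY_DMA_CHANNEL 3                         /* example DMA channel */

static int start_dma_read(unsigned long bus_addr, unsigned int count)
{
    if (request_dma(MY_DMA_CHANNEL, "mydev"))    /* reserve the channel */
        return -EBUSY;

    disable_dma(MY_DMA_CHANNEL);
    clear_dma_ff(MY_DMA_CHANNEL);                /* reset the flip-flop */
    set_dma_mode(MY_DMA_CHANNEL, DMA_MODE_READ);
    set_dma_addr(MY_DMA_CHANNEL, bus_addr);      /* bus address of the buffer */
    set_dma_count(MY_DMA_CHANNEL, count);
    enable_dma(MY_DMA_CHANNEL);                  /* transfer now runs without the CPU */
    return 0;
}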

Q How can a driver be implemented ? Explain with the following functions :

- setup
- init
- open
- release
- read
- write
- IOCTL
- select

Ans : setup () :

The setup() function must initialize the hardware devices in the computer and set up the
environment for the execution of the kernel program. Although the BIOS has already initialized most
hardware devices, Linux does not rely on it but reinitializes the devices in its own manner to
enhance portability and robustness. Sometimes it is desirable to pass parameters to a device
driver or to the Linux kernel in general. These parameters come in the form of a command line
from the Linux loader LILO. This command line is analyzed into its component parts by the
function parse_options(). The checksetup() function is called for each of the parameters and
compares the beginning of the parameter with the string stored in the bootsetups[ ] field, calling
the corresponding setup( ) function whenever these match. The checksetup() function will attempt
to convert the first ten parameters into integer numbers. If this is successful, they are stored in a
field.

Init() :

The init() function is only called during kernel initialization, but is responsible for important tasks.
This function tests for the presence of a device, generates internal device driver structures and
registers the device.

The call to the init function must be carried out in one of the following functions, depending on the
type of device driver:

Character devices : chr_dev_init()
Block devices : blk_dev_init()
SCSI devices : scsi_dev_init()
Network devices : net_dev_init()

Before Linux can make use of the driver, it must be registered using a function such as
register_chrdev().

The init() function is also the right place to test whether a device supported by the driver is present
at all. This applies especially for devices which cannot be connected or changed during operation,
such as hard disks.
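
A minimal sketch of an init() function for a character driver using the old register_chrdev() interface is
shown below; MY_MAJOR, the "mydev" name and my_fops are assumptions for illustration.

#include <linux/fs.h>
#include <linux/kernel.h>
#include <linux/errno.h>

#define MY_MAJOR 60                    /* example major number */

static struct file_operations my_fops; /* entry points filled in elsewhere */

int my_dev_init(void)
{
    /* a test for the presence of the hardware would go here */

    if (register_chrdev(MY_MAJOR, "mydev", &my_fops) < 0) {
        printk("mydev: cannot register major number %d\n", MY_MAJOR);
        return -EIO;
    }
    return 0;
}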
Open ():

The open() function is responsible for administering all the devices and is called as soon as a
process opens a device file. If only one process can work with a given device, -EBUSY should be
returned when another process wants to open the device. If a device can be used by a number of
processes at the same time, open() should set up the necessary wait queues. If no device exists, it
should return -ENODEV. The open() function is also the right place to initialize the standard
settings needed by the driver.

Release() :

The release() function is only called when the file descriptor for the device is released. The tasks of
this function comprise cleaning-up activities of a global nature, such as clearing wait queues. For
some devices it can also be useful to pass through to the device all the data still in the buffers.

Read() & write() :

The read() and write() functions perform a similar task, that is, copying data from and to application
code. Whenever an input device is used the read() function is called, and for output devices the write()
function is called, because only read operations are possible on input devices such as the mouse and
keyboard, and only write operations are possible on output devices such as the printer and monitor.

IOCTL() :

Each device has its own characteristics, which may consist of different operation modes and
certain basic settings. It may also be that device parameters such as IRQs, I/O addresses and so
on need to be set at run-time. The ioctl() function usually only changes variables global to the driver
or global device settings.
Select () :

The select() function checks whether data can be read from the device or written to it. If the device
is free or the argument wait is NULL, the device is only checked. If it is ready for the function
concerned, select() returns 1, otherwise 0. If wait is not NULL, the process must be held up
until the device becomes available.
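
As a hedged illustration of how these entry points were tied together, the old 2.x-era file_operations
structure could be filled in roughly as follows; the exact set of fields changed between kernel versions,
and the my_* functions are placeholders assumed to be defined elsewhere in the driver with the
prototypes required by that kernel version.

#include <linux/fs.h>

static struct file_operations my_fops = {
    read:    my_read,      /* read()    */
    write:   my_write,     /* write()   */
    ioctl:   my_ioctl,     /* IOCTL()   */
    select:  my_select,    /* select()  */
    open:    my_open,      /* open()    */
    release: my_release,   /* release() */
};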

Chapter 9 : Modules and debugging


Q What are modules ? How are they implemented in the kernel ?

Ans : Modules are components of the Linux kernel that can be loaded and attached to it as needed. To
add support for a new device, you can now simply instruct the kernel to load its module. In some
cases, you may have to recompile only that module to provide support for your device. The use of
modules has the added advantage of reducing the size of the kernel program : the kernel loads
modules into memory only as they are needed. For example, the modules for block devices and
file systems are only loaded when you actually use the device or the file system.

Implementation in the kernel :

Linux provides three system calls : create_module, init_module and delete_module for
implementation of Linux modules. A further system call is used by the user process to obtain a
copy of the kernel’s symbol table.

The administration of modules under Linux makes use of a list in which all the modules loaded are
included. This list also administers the modules’ symbol tables and references.

As far as the kernel is concerned, modules are loaded in two steps corresponding to the system
calls create_module and init_module. For the user process, this procedure divides into four
phases.

The process fetches the content of the object file into its own address space. To get the code and
data into a form in which they can actually be executed, the actual load address must be added at
various points. This process is known as relocating.
The system call create_module is now used, firstly to obtain the final address of the object module
and secondly to reserve memory for it. To do this, a structure module is entered for the module in
the list of modules and the memory is allocated. The return value gives us the address to which the
module will later be copied.

The load address received from create_module is used to relocate the object file. This procedure
takes place in a memory area belonging to the process : for a user process the relocation is done in
the user area, and for a kernel process in the kernel segment.

When a module is already in use by one process and another process wishes to use it, the module
loaded earlier is used. This mechanism is known as module stacking.

Once the preliminary work is complete, we can load the object module. This uses the system call
init_module. The cleanup() function is called when the module is deinstalled.

By using the system call delete_module, a module that has been loaded can be removed again.
Two preconditions need to be met for this : there must be no references to the module and the
module’s use counter must hold a value of zero.
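
A minimal sketch of an old-style loadable module is shown below; newer kernels use the
module_init()/module_exit() macros instead, but the principle is the same.

#include <linux/module.h>
#include <linux/kernel.h>

int init_module(void)                  /* run when the module is loaded */
{
    printk("example: module loaded\n");
    return 0;                          /* a non-zero value would abort the loading */
}

void cleanup_module(void)              /* run when the module is removed */
{
    printk("example: module removed\n");
}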

Q Define the Kernel Daemon?

Ans : The kernel daemon is a process which automatically carries out loading and removing of modules
without the system user noticing it. For example, whenever a file on a floppy is accessed, the kernel
daemon loads the block device module for handling the block device and the file system
module for the particular file system. But how does the kernel daemon know that modules need to be
loaded ?

Communication between the Linux kernel and the kernel daemon is carried out by means of IPC.
The kernel daemon opens a message queue with the new flag IPC_KERNELD. The kernel sends
messages to the kernel daemon using the kerneld_send() function. The request is stored in a
kerneld_msg structure, which includes different pieces of information :

mtype : component containing the message
id : indicates whether the kernel expects an answer
pid : component holding the PID of the process that triggered the kernel request

Responsibility for loading and releasing modules lies with the following functions :

request_module : the kernel requests the loading of a module and waits until the operation has been
carried out
release_module : removes a module
delayed_release_module : allows a module to be removed after a specified delay
cancel_release_module : cancels a delayed module removal
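
A hedged example of the kernel-side request is given below; the "block-major-2" name follows the usual
convention for requesting a missing block driver (major number 2 is the floppy driver), and the
surrounding function is hypothetical.

#include <linux/kerneld.h>

static int ensure_floppy_driver(void)
{
    /* blocks until the kernel daemon has loaded the module (or failed) */
    return request_module("block-major-2");
}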

Q Define the Debugging ?

Ans : Debugging is the process of finding errors; whenever an error occurs at run time, it is rectified
and a warning is given. Only in a few cases will a section of program code be free of bugs as
soon as it is written. Usually the program will need debugging, for which it will be loaded into a
debugger such as gdb and run step by step until the error has been found.

The most common debugging technique is monitoring. When you are debugging kernel code, you
can accomplish this goal with printk.

Printk :

When debugging with printk, the code is instrumented with check points and, when an error occurs,
an appropriate alarm message is printed. For example, whenever a kernel segment process wishes to
access the data and code of a user segment process, the verify_area() function is called; it checks the
whole area related to the process and, if an error occurs, calls the printk debugger, which prints the
appropriate message.
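
A hedged sketch of such monitoring with printk() follows; the check and the message text are only
examples, and KERN_DEBUG is one of the standard log-level prefixes.

#include <linux/kernel.h>

static int check_buffer(const char *buf, int len)
{
    if (!buf || len < 0) {
        printk(KERN_DEBUG "mydrv: bad buffer %p, length %d\n", buf, len);
        return -1;                     /* report the error to the caller */
    }
    return 0;
}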

Gdb - GNU debugger :

Execution tracing is a technique that allows one program to monitor the execution of another
program. The traced program can be executed step by step, until a signal is received or until a
system call is invoked. Execution tracing is widely used by debuggers, together with other
techniques like the insertion of breakpoints in the debugged program and run-time access to its
variables. In Linux, execution tracing is performed through the ptrace() system call, and the gdb
debugger works on top of ptrace(). It examines the code and data; if an error is found, you can try to
repair it and resume execution, otherwise an appropriate message is printed.

Chapter 10 : Multi-processing
Q Define the SMP ?

Ans : Most systems are single-processor systems; that is, they have only one main CPU. But sometimes
applications require more processing power. In this situation multiple processors are used in
close communication, sharing the computer bus, the clock, and sometimes memory and peripheral
devices. The most common multiple-processor systems now use the symmetric multiprocessing
(SMP) model, in which each processor runs an identical copy of the operating system, and these
copies communicate with one another as needed.

Most of the currently available multi-processor main boards for PCs use i486, Pentium or Pentium
Pro processors. The Pentium already has some internal functions which support multi-processor
operation, such as cache synchronization and inter-processor interrupt handling.

SMP defines a highly symmetrical architecture in terms of memory symmetry and I/O symmetry :

Q Difference between Memory symmetry and I/O symmetry ?

Memory Symmetry :
All processors share the same main memory; in particular, all physical addresses are the same.
This means that all processors execute the same operating system, all data and applications are
visible to all processors and can be used or executed on every processor.

I/O Symmetry :

All processors share the same I/O subsystem (including the I/O ports and the interrupt controller).
I/O symmetry allows reduction of a possible I/O bottleneck. However, some MP systems assign all
interrupts to one single processor, while others use the I/O APIC (Advanced Programmable
Interrupt Controller). All CPUs are connected by the ICC (Interrupt Controller Communications)
bus.

One processor is chosen by the BIOS; it is called the boot processor (BSP) and is used for
system initialization. All other processors are called application processors (AP) and are initially
halted by the BIOS.

Problems with multi-processor systems :

For the correct functioning of a multi-tasking system it is important that data in the kernel can only
be changed by one processor at a time, so that identical resources cannot be allocated twice. For this,
coarse-grained locking is used; sometimes even the whole kernel is locked so that only one process
can be present in the kernel. Finer-grained locking can also be used, but it is normally found only in
multi-processor and real-time operating systems.

In the Linux kernel implementation, various rules were established :

No process running in kernel mode is interrupted by another process running in kernel mode,
except when it releases control and sleeps.

Interrupt handling can interrupt a process running in kernel mode, but in the end control is
returned to this same process. A process can block interrupts and thus make sure that it will
not be interrupted.

Interrupt handling cannot be interrupted by a process running in kernel mode. This means that the
interrupt handling will be processed completely, or at most be interrupted by another interrupt of
higher priority.

In the development of the multi-processor LINUX kernel a decision was made to maintain these
three basic rules. A single semaphore is used to monitor the transition of all processes to kernel mode.
This semaphore is used to ensure that no process running in kernel mode can be interrupted by
another process. Furthermore, it guarantees that only a process running in kernel mode can block
the interrupts without another process taking over the interrupt handling.

Changes to the Kernel :

In order to implement SMP in the LINUX kernel, changes have to be made :

Kernel Initialization :

The first problem with the implementation of multi-processor operation arises when starting the
kernel. Initially the BIOS runs the boot processor and halts all APs. Only this processor enters
the kernel starting function start_kernel(). After it has executed the normal LINUX initialization,
smp_init() is called. This function activates all other processors by calling smp_boot_cpus().

Scheduling :

The LINUX scheduler, whose responsibility is to allocate the processor to runnable processes,
shows only slight changes. First of all, the task structure now has a processor
component which contains the number of the running processor. The last_processor
component contains the number of the processor which processed the task last.

Message exchange between processors :

Messages in the form of inter-processor interrupts are handled via interrupts 13 and 16.
Interrupt 13 is defined as a fast interrupt which does not need the kernel lock and can
thus always be processed. Interrupt 16 is a slow interrupt which waits for the kernel lock and
can trigger scheduling. It is used to start the schedulers on the other processors.

Entering kernel mode :

The kernel is protected by a single semaphore. All interrupt handlers, syscall routines and
exception handlers need this semaphore and wait in a processor loop until the semaphore is
free.
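
A hedged sketch of this coarse-grained approach, using lock_kernel()/unlock_kernel() from the old
<linux/smp_lock.h>, is shown below; the surrounding function is only an illustration.

#include <linux/smp_lock.h>

static void enter_kernel_path(void)
{
    lock_kernel();      /* wait until no other processor holds the kernel */
    /* ... work on kernel data structures ... */
    unlock_kernel();    /* let other processors enter kernel mode again */
}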

Interrupt Handling :

Interrupts are distributed to the processors by the I/O APIC. At system start, however, all
interrupts are forwarded only to the BSP. Each SMP operating system must therefore switch
the APIC into SMP mode, so that other processors too can handle interrupts.

Linux does not use this operating mode; that is, during the whole time the system is operating,
interrupts are only delivered to the BSP. This compromises the interrupt latency.
