FILE MANAGEMENT
File System
Disks are used as a primary storage medium for information.
The file system provides a convenient mechanism to store and retrieve the data
and programs from this medium.
In fact, files are used as a collection of related information, the meaning of which
is defined by its creator or author.
These files are mapped to the disks or other storage media by the OS.
Files themselves are organized into directories.
A file is a named collection of related information that is recorded on secondary
storage such as magnetic disks, magnetic tapes and optical disks. In general, a
file is a sequence of bits, bytes, lines or records whose meaning is defined by the
file's creator and user.
A file allows one to write, read, save and retrieve the program and data on any
type of storage device.
11/10/2014
ChonuChinnus
However, a file is a logical concept and is not, in itself, stored on permanent media. Therefore,
when work is saved, the file must be mapped onto the storage device.
(Note: This is background work that is not seen by the user.)
The OS handles all the work required to map the logical file to secondary
storage.
The OS abstracts the actual storage of the program and data from the user and
provides a logical and convenient file concept.
File Structure
The basic element of data is the field: a single-valued item such as a name, a date,
or an employee ID.
A field is characterized by its length and data type.
When multiple fields are combined to form a meaningful collection, the result is known
as a record.
For example, a student's record can consist of fields such as
roll number, name, qualification and so on.
When similar records are collected together, the collection is known as a file.
Like a field or a record, a file has a name. Thus, a file is treated as a single
entity that may be used by a programmer or an application.
A file can also be flat, that is, an unstructured sequence of bytes with no
fields or records. Programs that use such a file see it simply as a
sequence of bytes.
UNIX and Windows use this kind of file structure.
However, the OS must support a required structure for certain types of files. For
example, an executable file must have a defined structure so that the OS can
determine where in memory to load the file and locate the first instruction
to be executed.
[Figure: fields (Roll No, Name, Class, Address) make up a record; a collection of such records makes up a file.]
Locating an offset within a logical file may be difficult for an OS. Since the logical
file will be mapped to the secondary storage, it is better to define the internal
structure of a file in terms of the units of secondary storage.
In general, the disk is used for secondary storage, and hence the block is the unit taken
for storage.
The block unit needs to be mapped to the logical file structure as well.
For example, a file is considered a stream of bytes in UNIX. Each byte in the file
can be located by its offset from the start of the file.
Consider the logical record size as 1 byte. Now assume the packing of some bytes
or records in the file into the disk blocks. For example, a group of 512 bytes is
packed into one block of disk. In this way, a file may be considered as a
sequence of blocks and therefore, all basic I/O functions are performed in terms
of blocks. This is known as Record Blocking.
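The mapping from a byte offset in the logical file to a disk block can be sketched as follows. This is a minimal illustration, assuming the 512-byte block from the example above; the function names locate and blocks_needed are hypothetical and not part of any OS interface.

```python
BLOCK_SIZE = 512  # bytes per disk block, taken from the example above


def locate(byte_offset, block_size=BLOCK_SIZE):
    """Map a byte offset in the logical file to (relative block, offset within block)."""
    if byte_offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(byte_offset, block_size)


def blocks_needed(file_size, block_size=BLOCK_SIZE):
    """Number of blocks required to hold file_size bytes (ceiling division)."""
    return -(-file_size // block_size)
```

For instance, byte 1300 of a file falls in relative block 2 at offset 276, and a 1300-byte file occupies three blocks, the last of which is only partly used.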
The larger the block, the more records are mapped onto
each disk block. In turn, more bytes are transferred in one
I/O operation. This is advantageous when the file is searched sequentially,
because it reduces the number of I/O operations.
Another issue regarding record blocking is whether the blocks should be fixed or
variable-size.
Based on this issue, there are three methods of blocking:
Fixed Blocking
Variable-length Spanned Blocking
Variable-length Unspanned Blocking
Fixed Blocking: In this method, fixed-size records are used. There may therefore
be a mismatch between the sizes of records and blocks, leaving some unused
space in the blocks. This causes internal fragmentation. In the figure below, R1, R2 and
R3 fit in the fixed blocks but R4 does not, leaving some space in the last block.
Space may also be left unused if there is not enough room to allocate a block at
the end of a track on the disk. This method is advantageous when
sequential files are used.
[Figure: records R1 to R4 packed into fixed-size disk blocks]
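The internal fragmentation caused by fixed blocking can be estimated with a short calculation. The sketch below assumes unspanned fixed-size records (a record never crosses a block boundary); the function name fixed_blocking and the sizes used in the usage note are hypothetical.

```python
def fixed_blocking(block_size, record_size, record_count):
    """Return (records per block, unused bytes per full block, blocks needed).

    Assumes fixed-size records that are never split across blocks.
    """
    if record_size <= 0 or record_size > block_size:
        raise ValueError("record must fit inside one block")
    per_block = block_size // record_size
    wasted_per_block = block_size - per_block * record_size
    blocks = -(-record_count // per_block)  # ceiling division
    return per_block, wasted_per_block, blocks
```

With 512-byte blocks and 120-byte records, four records fit in a block and 32 bytes of each full block stay unused; ten such records need three blocks.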
Executable File: When an object file has been linked properly and is ready to
run, it is known as an executable file. Its extension may be .exe, .com, .bin, etc.
Text File: A general text-format document is known as a text file. Its
extension is .txt.
Batch File: A file consisting of commands to be executed by the
command interpreter. Its extension is .bat.
File Attributes
Besides the name and data, a file has other attributes as well. These attributes
vary from system to system.
However, some attributes are common to nearly every system, for example, the date
and time of creation of a file. Some attribute types are as follows:
General Information: Some attributes of a file are general. For example, name,
type, location, size, time and date of creation.
Protection-related Attributes: A file may have access protection enabled, so users
cannot access it arbitrarily; access is governed by the file's access
rights, for example read, write and execute permissions. A password for the file
and the creator/owner of the file also count as protection attributes.
Flags: Some flags control or enable some specific property of a file. Some of them
are:
1. Read-only Flag: It is used for making a file read-only. It is 0 for read/write and
1 for read-only.
2. Hidden Flag: It is used to hide a file in the listing of the files. It is set for hiding
the file, otherwise, the file is displayed.
3. System Flag: It is used to designate a file as a system file. It is set to mark the
file as a system file; otherwise, the file is a normal one.
4. Archive Flag: It is used to keep track of whether the file has been backed up or
not. The OS sets it whenever a file is changed. The flag is 0, when the changed
file has been backed up.
5. Access Flag: It is used to convey how the file is accessed. It is set when the file
is accessed randomly, otherwise, the file is accessed sequentially.
Time of Last Change and Last Access: These record the
time when the file was last modified and when it was last accessed.
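The flags listed above are commonly stored as individual bits in one attribute word. The following sketch models them with Python's enum.IntFlag; the bit positions, the class name FileFlag and the helper functions are illustrative assumptions, not the layout of any particular file system.

```python
from enum import IntFlag, auto


class FileFlag(IntFlag):
    """One bit per flag; the bit positions here are arbitrary."""
    READ_ONLY = auto()      # set: read-only, clear: read/write
    HIDDEN = auto()         # set: omitted from directory listings
    SYSTEM = auto()         # set: system file
    ARCHIVE = auto()        # set: changed since the last backup
    RANDOM_ACCESS = auto()  # set: random access, clear: sequential


def mark_changed(flags):
    """The OS sets the archive flag whenever the file is changed."""
    return flags | FileFlag.ARCHIVE


def mark_backed_up(flags):
    """A backup clears the archive flag again."""
    return flags & ~FileFlag.ARCHIVE


def visible_in_listing(flags):
    """A file is listed only when its hidden flag is clear."""
    return not (flags & FileFlag.HIDDEN)
```

Keeping the flags in one word lets several of them be tested or changed with a single bitwise operation.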
File Operations
A file is an abstract data type, so the operations that can be performed on
it must be defined. The OS provides system calls for each operation
on a file. The following operations are performed on a
file:
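Create a File: This is the file creation operation. Two steps are necessary to create a
file:
1. Space for the file must be found in the file system.
2. An entry for the new file must be made in the directory.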
Write a File: The write operation needs the name of the file and the data to be
written. The OS must have a pointer in the file for reading and writing. The
system must keep a write pointer to the location in the file where the next write
is to take place. The write pointer must be updated whenever a write occurs.
Saving a File: The contents of the file must be saved on the disk. For this, the OS
must find space on the disk and then write the contents to it. The appropriate entry in the
directory where the file is created is also made.
Deleting a File: To delete a file, we search the directory for the named file.
Having found the associated directory entry, we release all file space, so that it
can be reused by other files, and erase the directory entry.
Open a File: Before a process uses a file for any operation, the file must be
opened. To open the file, the OS fetches its attributes and list of disk addresses into main
memory for quick access.
Close a File: A file that is no longer needed for access may be closed. The close
operation frees the memory space used for attributes and disk addresses. Closing a file
does not delete it, as the file is not removed from the disk.
Read a File: To read from a file, we use a system call that specifies the name of
the file and where (in memory) the next block of the file should be put. The
system needs to keep a read pointer to the location in the file where the next
read is to take place.
1. Since a process is usually either reading from or writing to a file, the current operation
location can be kept as a per-process current-file-position pointer.
2. Both the read and write operations use this same pointer, saving space and reducing system
complexity.
Append a File: This is similar to writing a file, except that the
operation is performed only at the end of the file. The OS locates the end of the
file, using the pointer, and then appends the data to be written.
Repositioning the Current Position Pointer: The directory is searched for the
appropriate entry, and the current-file-position pointer is repositioned to a given
value. Repositioning within a file need not involve any actual I/O. This file
operation is also known as a file seek.
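The operations above map closely onto the low-level calls exposed by Python's os module, which wrap the operating system's own file calls. The sketch below is one possible walk-through, not a prescribed sequence; the function name demo_file_operations and the file name demo.dat are hypothetical, and the file lives in a temporary directory that is removed at the end.

```python
import os
import tempfile


def demo_file_operations():
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "demo.dat")  # hypothetical file name

    # Create and open: a directory entry is made and a descriptor is returned.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.write(fd, b"hello world")                # write advances the pointer
        after_write = os.lseek(fd, 0, os.SEEK_CUR)  # query the current position

        os.lseek(fd, 6, os.SEEK_SET)                # seek: reposition the pointer
        tail = os.read(fd, 5)                       # read uses the same pointer

        os.lseek(fd, 0, os.SEEK_END)                # append: locate the end first
        os.write(fd, b"!")
        size = os.fstat(fd).st_size
    finally:
        os.close(fd)                                # close releases the descriptor

    exists_after_close = os.path.exists(path)       # closing does not delete
    os.remove(path)                                 # delete removes the entry
    exists_after_delete = os.path.exists(path)
    os.rmdir(workdir)
    return after_write, tail, size, exists_after_close, exists_after_delete
```

Note that the read after the seek and the write before it share one position pointer, which is the per-process current-file-position pointer described above.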
File Access
The files stored on the disk are required to be retrieved by the user. But there are
many ways to access a file. The file access depends on the blocking strategy on
the disk and the logical structuring of records. The following are some file access
methods:
Sequential File Access: The file is accessed sequentially, i.e., the information in
the files is accessed in the order it is stored in the file. Information in the file is
processed in order, one record after the other.
Editors and compilers usually access files sequentially.
Reads and writes make up the bulk of the operations on a file.
The read operation, readnext(), reads the next portion of the file and automatically
advances a file pointer, which tracks the I/O location.
Similarly, the write operation, writenext(), appends to the end of the file and
advances to the end of the newly written material (the new end of file).
Such a file can be reset to the beginning, and on some systems, a program may
be able to skip forward or backward n records for some integer n, perhaps
only for n = 1.
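Sequential access can be modelled with a small class over an in-memory buffer. This is a toy sketch assuming fixed-size records; read_next and write_next stand in for the readnext() and writenext() operations named above, while the class name SequentialFile and the reset and skip methods are hypothetical names chosen for illustration.

```python
import io


class SequentialFile:
    """Toy sequential-access file of fixed-size records held in memory."""

    def __init__(self, record_size):
        self.record_size = record_size
        self._buf = io.BytesIO()

    def write_next(self, record):
        """Append one record at the end of the file and stay at the new end."""
        if len(record) != self.record_size:
            raise ValueError("record has the wrong size")
        self._buf.seek(0, io.SEEK_END)
        self._buf.write(record)

    def read_next(self):
        """Return the next record and advance the pointer, or None at end of file."""
        data = self._buf.read(self.record_size)
        return data or None

    def reset(self):
        """Move the pointer back to the beginning of the file."""
        self._buf.seek(0)

    def skip(self, n):
        """Move forward (n > 0) or backward (n < 0) by n records, not before the start."""
        pos = max(0, self._buf.tell() + n * self.record_size)
        self._buf.seek(pos)
```

A program using this class never names a record position directly; it can only move relative to where the pointer already is.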
Direct Access: A file is made up of fixed-length logical records that allow
programs to read and write records rapidly in no particular order.
The direct-access method is based on a disk model of a file, since disks allow
random access to any file block.
For direct access, the file is viewed as a numbered sequence of blocks or records.
Thus, we may read block 14, then read block 53, and then write block 7. There
are no restrictions on the order of reading or writing for a direct-access file.
Direct-access files are of great use for immediate access to large amounts of
information. Databases are often of this type. When a query concerning a
particular subject arrives, we compute which block contains the answer and
then read that block directly to provide the desired information.
For the direct-access method, the file operations must be modified to include the
block number as a parameter.
Thus, we have read(n), where n is the block number, rather than readnext(), and
write(n) rather than writenext().
An alternative approach is to retain readnext() and writenext(), as with sequential
access, and to add an operation positionfile(n), where n is the block number.
Then, to effect a read(n), we would positionfile(n) and then readnext().
The block number provided by the user to the operating system is normally a
relative block number. A relative block number is an index relative to the
beginning of the file. Thus, the first relative block of the file is 0, the next is 1,
and so on, even though the absolute disk address may be 14703 for the first
block and 3192 for the second. The use of relative block numbers allows the
operating system to decide where the file should be placed and helps to prevent
the user from accessing portions of the file system that may not be part of her
file.
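The relationship between read(n), positionfile(n) and readnext() can be sketched with an in-memory file of fixed-size blocks. This is a minimal illustration: the class name DirectFile is hypothetical, the method names follow the operations above in Python style, and the bounds check stands in for the protection that relative block numbers provide.

```python
import io


class DirectFile:
    """Toy direct-access file addressed by relative block number."""

    def __init__(self, block_size, block_count):
        self.block_size = block_size
        self.block_count = block_count
        self._buf = io.BytesIO(bytes(block_size * block_count))  # zero-filled

    def position_file(self, n):
        """Move the pointer to relative block n (0 is the first block of the file)."""
        if not 0 <= n < self.block_count:
            raise IndexError("relative block number outside the file")
        self._buf.seek(n * self.block_size)

    def read_next(self):
        return self._buf.read(self.block_size)

    def write_next(self, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block")
        self._buf.write(data)

    def read(self, n):
        """read(n) expressed as position_file(n) followed by read_next()."""
        self.position_file(n)
        return self.read_next()

    def write(self, n, data):
        """write(n) expressed as position_file(n) followed by write_next()."""
        self.position_file(n)
        self.write_next(data)
```

Because callers only ever supply relative block numbers, the absolute placement of the blocks stays hidden, and a number outside the file is rejected rather than mapped to someone else's data.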
Indexed Access: The index, like an index in the back of a book,
contains pointers to the various blocks.
To find a record in the file, we first search the index and then use the pointer to
access the file directly and to find the desired record.
For example, IBM's indexed sequential-access method (ISAM) uses a small
master index that points to disk blocks of a secondary index. The secondary
index blocks point to the actual file blocks. The file is kept sorted on a defined
key.
To find a particular item, we first make a binary search of the master index,
which provides the block number of the secondary index.
This block is read in, and again a binary search is used to find the block
containing the desired record. Finally, this block is searched sequentially.
In this way, any record can be located from its key by at most two direct-access
reads.
The figure below shows a similar situation as implemented by VMS index and
relative files.
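The two-level lookup described above can be sketched with Python's bisect module. This is an illustrative model only, not IBM's implementation: it assumes the master index is held in memory and stores the highest key covered by each secondary-index block, and the function name isam_lookup and all data layouts are hypothetical.

```python
from bisect import bisect_left


def isam_lookup(key, master, secondary, data_blocks):
    """Find the record for key, or return None if it is absent.

    master:      sorted list; entry i is the highest key covered by secondary[i]
    secondary:   list of index blocks, each a sorted list of
                 (highest key in data block, data block number) pairs
    data_blocks: dict mapping block number to a list of (key, record) pairs
    """
    i = bisect_left(master, key)             # binary search of the master index
    if i == len(master):
        return None                          # key exceeds every indexed key
    index_block = secondary[i]               # first direct-access read
    highs = [high for high, _ in index_block]
    j = bisect_left(highs, key)              # binary search of the secondary index
    if j == len(highs):
        return None
    _, block_no = index_block[j]
    for k, record in data_blocks[block_no]:  # second read, then a sequential scan
        if k == key:
            return record
    return None
```

Only the secondary-index block and one data block are fetched per lookup, which matches the bound of two direct-access reads stated above.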
Storage devices can also be collected together into RAID sets that provide
protection from the failure of a single disk. Sometimes, disks are subdivided and
also collected into RAID sets.
Partitioning is useful for limiting the sizes of individual file systems, putting
multiple file-system types on the same device, or leaving part of the device
available for other uses, such as swap space or unformatted (raw) disk space.
A file system can be created on each of these parts of the disk. Any entity
containing a file system is generally known as a volume. The volume may be a
subset of a device, a whole device, or multiple devices linked together into a
RAID set.
Each volume that contains a file system must also contain information about the
files in the system. This information is kept in entries in a device directory or
volume table of contents.
The device directory (or directory) records information such as name, location,
size, and type for all files on that volume.
Storage Structure
For example, a typical Solaris system may have dozens of file systems of a dozen
different types.
Consider the types of file systems in the Solaris OS:
tmpfs: A temporary file system that is created in volatile main memory and has its contents
erased if the system reboots or crashes.
objfs: A virtual file system (essentially an interface to the kernel that looks like a file system)
that gives debuggers access to kernel symbols.
ctfs: A virtual file system that maintains contract information to manage which processes start
when the system boots and must continue to run during operation.
lofs: A loopback file system that allows one file system to be accessed in place of another one.
procfs: A virtual file system that presents information on all processes as a file system.
ufs, zfs: General-purpose file systems.
Directory Overview
The directory can be viewed as a symbol table that translates file names into
their directory entries.
A symbol table is a data structure used by a language translator such as a
compiler or interpreter, where each identifier in a program's source code is
associated with information relating to its declaration or appearance in the
source, such as its type, scope level and sometimes its location.
The organization must allow us to insert entries, to delete entries, to search for a
named entry, and to list all the entries in the directory.
The operations that are to be performed on a directory:
Search for a File: We need to be able to search a directory structure to find the
entry for a particular file. Since files have symbolic names, and similar names
may indicate a relationship among files, we may want to be able to find all files
whose names match a particular pattern.
Create a File: New files need to be created and added to the directory.
Delete a File: When a file is no longer needed, we want to be able to remove it
from the directory.
List a Directory: We need to be able to list the files in a directory and the
contents of the directory entry for each file in the list.
Rename a File: Since the name of a file represents its contents to its users, we
must be able to change the name when the contents or use of the file changes.
Renaming a file may also allow its position within the directory structure to be
changed.
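The directory operations above can be sketched by treating the directory as a table from file names to entries. This is a minimal single-level model; the class name Directory, its method names and the attribute keywords are hypothetical, and pattern matching uses shell-style wildcards from Python's fnmatch module.

```python
from fnmatch import fnmatchcase


class Directory:
    """Toy directory: a table that translates file names into directory entries."""

    def __init__(self):
        self._entries = {}

    def create(self, name, **attributes):
        if name in self._entries:
            raise FileExistsError(name)
        self._entries[name] = dict(attributes)

    def search(self, name):
        """Return the entry for name, or None if there is no such file."""
        return self._entries.get(name)

    def match(self, pattern):
        """Return the sorted names that match a wildcard pattern such as '*.c'."""
        return sorted(n for n in self._entries if fnmatchcase(n, pattern))

    def delete(self, name):
        del self._entries[name]  # raises KeyError if the file does not exist

    def list_entries(self):
        """Return every (name, entry) pair in name order."""
        return sorted(self._entries.items())

    def rename(self, old, new):
        if new in self._entries:
            raise FileExistsError(new)
        self._entries[new] = self._entries.pop(old)
```

A hash table gives fast search, insert and delete, while listing requires sorting the names; real directory implementations make different trade-offs.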
Directory Structure:
The most common schemes for defining the logical structure of a directory:
Single-Level Directory
Two-Level Directory
Tree-Structured Directory
Acyclic-Graph Directory
General Graph Directory