SURVEY ON ALLOCATION METHODS

Introduction
Disk space in secondary storage is used for the physical storage of programs that are executed
logically in main memory. The operating system allocates space for each program in physical
memory. The allocation of disk space involves several activities, such as searching for a range
of free space, indexing data on disk, and retrieving and ordering content, which together enable
efficient utilization of disk space. Much earlier research has been carried out in terms of
ordered pairs of sets, as in buddy systems.
Earlier work was carried out on contiguous allocation, linked allocation and indexed allocation,
where the major contribution lies in defining data structures and methods that enhance the
effective utilization of disk space. The work carried out on various allocation methods since
1970 is described in detail below:
Prior to 1970, disc allocation was administered by means of a map held in core, with one
bit per disc page, managed by a method due to J. L. Smith.
B. J. Austin retains all of Smith's procedure, except that the technique for searching the map
has been improved. Allocation is dynamic: space is given to a file only as needed and is not
reserved in advance. The system also maintains a list of holes in the disc map. This list is
added to whenever a search for allocation finds a string of adjacent free pages whose length is
less than that required for the request in hand.
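The map-plus-hole-list idea can be illustrated with a minimal Python sketch. It is not Austin's algorithm: the class name `BitmapAllocator`, the rule of consulting the hole list before scanning the map, and the omission of page release are all assumptions made for illustration.

```python
class BitmapAllocator:
    """One flag per disc page (True = free), plus a list of known holes.

    Hypothetical sketch: names and the search order are assumptions.
    """

    def __init__(self, num_pages):
        self.free = [True] * num_pages
        self.holes = {}  # start page -> length of a run found too short earlier

    def _free_runs(self):
        """Yield (start, length) for each maximal run of adjacent free pages."""
        start = None
        for page, is_free in enumerate(self.free):
            if is_free and start is None:
                start = page
            elif not is_free and start is not None:
                yield start, page - start
                start = None
        if start is not None:
            yield start, len(self.free) - start

    def _take(self, start, length):
        """Mark `length` pages from `start` as in use and return `start`."""
        for page in range(start, start + length):
            self.free[page] = False
        return start

    def allocate(self, length):
        """Return the first page of a free run of `length` pages, or None."""
        # Assumption: try previously recorded holes before scanning the map.
        for start, hole_len in sorted(self.holes.items()):
            if hole_len >= length:
                del self.holes[start]
                if hole_len > length:
                    self.holes[start + length] = hole_len - length
                return self._take(start, length)
        # Scan the map; runs shorter than the request are remembered as holes.
        for start, run_len in list(self._free_runs()):
            if run_len >= length:
                return self._take(start, length)
            self.holes[start] = run_len
        return None
```

In this sketch a later, smaller request can be satisfied from the hole list without rescanning the map, which is one plausible reading of why the list reduces search time.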
File loading experiments have shown that the proposed allocation algorithm produces 80 page
tables, compared with 540 page tables under the previous allocation algorithm. A programmer may
produce a file in which the logical addressing space is non-contiguous. Such a file always has a
page table. The time for the file loading process has been considerably reduced for two reasons:
the number of disc operations is reduced, and the time spent searching the map is reduced.
The dumping and reloading of file storage takes a little less than hours of machine time.
The load takes about three quarters of an hour and the dump, which involves verification of
the tapes produced, takes about an hour. With the previous algorithm, the number of page tables
immediately after a file load was 500 to 1,000, and during a day an additional 100 to 200 would
be formed.

The system no longer spends a large amount of time searching the allocation map when the
disc is practically full, but no quantitative evaluation of this effect is available. The
algorithm seems to be satisfactory for both file loading and normal operations.
D. E. Gold et al. proposed a basic permutation algorithm and its variations. Their work deals
with masking rotational latency. Algorithms for the permutation of blocks in a memory hierarchy
were proposed, including a slow-to-fast transition algorithm and a fast-to-slow transition
algorithm; addition and deletion of blocks are special cases of permutation. An arbitrary
permutation algorithm, a standard permutation algorithm and permutation assignment algorithms
were also proposed. The method is implemented for a head-per-track disk system by
performing row permutations in intermediate (bulk storage) random access memory.
The first row permutation is accomplished by accumulating
blocks in an output buffer as they come from the primary memory. When a row is
accumulated, these blocks start to be output to the disk in their permuted order. The final row
permutations are performed similarly by accumulating a row of blocks in the input buffer
before transmission to primary memory in permuted order.
Howard Lee Morgan presented an optimization model for the assignment of files to disk
packs, and of packs to either resident or nonresident status. Heuristics are suggested
for those cases in which it is inefficient to compute the actual optimum.
When the amount of space required for file storage exceeds the amount which can be kept
online, decisions must be made as to which files are to be permanently resident and which
mountable. These decisions affect the number of mount requests issued to the operators.
Mounting is often a bottleneck in a computing facility, and reducing the number of mounts thus
decreases turnaround time.
In summary, the problem of optimally using disk storage devices is a complex one, with
effects contributed by queuing and scheduling as well as space allocation. The paper
attempts to describe where space allocation may fit in, and to prescribe some methods for
handling this important problem. Test cases of multiple knapsack algorithms were executed.
Kenneth K. Shen et al. presented an extension of the buddy method, called the weighted
buddy method, for dynamic storage allocation. The weighted buddy method
allows block sizes of $2^k$ and $3 \cdot 2^k$, whereas the original buddy method allowed only
block sizes of $2^k$. This extension is achieved at an additional cost of only two bits per block. Simulation
results are presented which compare this method with the buddy method. These results
indicate that, for a uniform request distribution, the buddy system has less total memory
fragmentation than the weighted buddy algorithm. However, the total fragmentation is
smaller for the weighted buddy method when the requests are for exponentially distributed
block sizes.
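To make the difference in permitted block sizes concrete, the following Python sketch rounds a request up to the nearest permitted size under each scheme. The function names are hypothetical, and the sketch covers only the rounding that causes internal fragmentation; it does not model the external fragmentation that the simulations also measure.

```python
def binary_buddy_size(request):
    """Smallest power of two that holds `request` units (request >= 1)."""
    size = 1
    while size < request:
        size *= 2
    return size


def weighted_buddy_size(request):
    """Smallest block of size 2**k or 3 * 2**k that holds `request` units."""
    power = binary_buddy_size(request)   # smallest 2**k >= request
    three_quarters = 3 * power // 4      # the 3 * 2**(k-2) size just below it
    if power >= 4 and three_quarters >= request:
        return three_quarters
    return power


def internal_waste(request, rounder):
    """Units allocated but not requested under the given rounding rule."""
    return rounder(request) - request
```

For a request of 5 units, the binary scheme allocates 8 (3 units wasted) while the weighted scheme allocates 6 (1 unit wasted).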
James A. Hinds described a simple scheme for determining the location of a block of
storage relative to other blocks. This scheme is applicable to buddy-type storage
allocation systems.
Warren Burton worked on a generalization of the buddy system for storage allocation. The set
of permitted block sizes $\{\mathrm{SIZE}_i\}_{i=0}^{n}$ must satisfy the condition
$\mathrm{SIZE}_i = \mathrm{SIZE}_{i-1} + \mathrm{SIZE}_{i-k(i)}$, where $k$ may be any
meaningful integral-valued function. This makes it possible
to force logical storage blocks to coincide with physical storage blocks, such as tracks and
cylinders.
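A recurrence in which each block size is the sum of the previous size and an earlier size chosen by a function k can be sketched in Python as follows. The function name `block_sizes`, the seed lists and the `limit` parameter are assumptions introduced for illustration; only the form of the recurrence comes from the text.

```python
def block_sizes(initial, k, limit):
    """Extend `initial` using SIZE[i] = SIZE[i-1] + SIZE[i-k(i)].

    `initial` seeds the sequence, `k` maps an index to a positive offset,
    and generation stops before a size would exceed `limit`.
    """
    sizes = list(initial)
    while True:
        i = len(sizes)
        nxt = sizes[i - 1] + sizes[i - k(i)]
        if nxt > limit:
            return sizes
        sizes.append(nxt)


# k(i) = 1 doubles each size, giving the binary buddy sizes.
binary = block_sizes([1], lambda i: 1, 64)
# k(i) = 2 adds the two preceding sizes, giving Fibonacci buddy sizes.
fibonacci = block_sizes([1, 2], lambda i: 2, 50)
```

Choosing a different k, or different seed sizes, is what would let the permitted sizes line up with physical units such as tracks and cylinders.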
James L. Peterson presented two algorithms for implementing any of a class of buddy
systems for dynamic storage allocation. Each buddy system corresponds to a set of recurrence
relations which relate the block sizes provided to each other. Analyses of the internal
fragmentation of the binary buddy system, the Fibonacci buddy system, and the weighted
buddy system are given. Comparative simulation results are also presented for internal,
external, and total fragmentation.
The total fragmentation of the weighted buddy system is generally worse than that of the
Fibonacci buddy system. The total fragmentation of the binary buddy
system varies widely because of its internal fragmentation characteristics. Still, the variation
among these buddy systems is not great, and the lower execution time of the binary buddy
would therefore seem to recommend it for general use, although the execution time of the
Fibonacci buddy system is not much greater. The weighted buddy system seems to be less
desirable than either the binary or the Fibonacci system owing to its higher execution time
and greater external fragmentation. In conclusion, the authors recommend that the
memory management module of a system be constructed as either a binary or
Fibonacci buddy system.
H. C. Du et al. studied Cartesian product files, which have been shown to exhibit attractive
properties for partial match queries. Their paper considers the file allocation problem for
Cartesian product files, which can be stated as follows: Given a k-attribute Cartesian product
file and an m-disk system, allocate buckets among the m disks in such a way that, for all
possible partial match queries, the concurrency of disk accesses is maximized. The Disk
Modulo (DM) allocation method is described first, and it is shown to be strict optimal under
many conditions commonly occurring in practice, including all possible partial match queries
when the number of disks is 2 or 3. It is also shown that although it has good performance,
the DM allocation method is not strict optimal for all possible partial match queries when the
number of disks is greater than 3. The General Disk Modulo (GDM) allocation method is
then described, and a sufficient but not necessary condition for strict optimality of the GDM
method for all partial match queries and any number of disks is then derived. Simulation
studies comparing the DM and random allocation methods in terms of the average number of
disk accesses, in response to various classes of partial match queries, show the former to be
significantly more effective even when the number of disks is greater than 3, that is, even in
cases where the DM method is not strict optimal. The results that have been derived formally
and shown by simulation can be used for more effective design of optimal file systems for
partial match queries. When considering multiple-disk systems with independent access
paths, it is important to ensure that similar records are clustered into the same or similar
buckets, while similar buckets should be dispersed uniformly among the disks.
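As a rough illustration of the DM method, the Python sketch below assigns each bucket to the disk given by the sum of its coordinates modulo the number of disks, and measures how evenly the buckets matching a partial match query are spread. The helper names and the representation of a query as a tuple with `None` for unspecified attributes are assumptions made for this sketch.

```python
from collections import Counter
from itertools import product


def dm_disk(bucket, num_disks):
    """Disk Modulo: sum of the bucket's coordinates modulo the disk count."""
    return sum(bucket) % num_disks


def max_buckets_on_one_disk(domain_sizes, query, num_disks):
    """Largest number of qualifying buckets that any single disk must serve.

    `query` holds a fixed coordinate per attribute, or None if the
    attribute is unspecified in the partial match query.
    """
    ranges = [range(size) if fixed is None else [fixed]
              for size, fixed in zip(domain_sizes, query)]
    load = Counter(dm_disk(b, num_disks) for b in product(*ranges))
    return max(load.values())
```

An allocation is strict optimal for a query when this maximum equals the number of qualifying buckets divided by the number of disks, rounded up. For a 4 x 4 file on 3 disks with both attributes unspecified, the sketch gives 6, which matches that bound for 16 buckets.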
Free Disk Space Management
Matthew S. Hecht et al. note that schemes for managing free disk
pages are so widely known that they must be considered part of the folklore of computer
science. Two popular data structures are bitmaps and linked lists. The bitmap has bit position
i set when disk page number i is free, and cleared when disk page number i is in use. The
linked list contains the page numbers of free disk pages. Less widely known, however, are
variations of such schemes that preserve the consistency of these data across failures. While
recoverable schemes for managing free disk pages are all based on the principle of
maintaining a copy (complete, or incremental with a base) on memory media with an
independent failure mode, the details of such schemes vary considerably. The general
problem we consider here is how to make the free-disk-space data structures survive two
kinds of failures: (1) failure of main memory (e.g., loss of power) resulting in loss of its
contents, and (2) failure of a disk transfer resulting in an unreadable disk page. This paper
presents a programming technique, using a linked list for managing the free disk pages of a
file system and using shadowing (also known as careful replacement [10]) for failure
recovery, which enjoys the following properties:
(1) The state of allocation at the previous checkpoint (a consistent system state preserved on
disk) is always maintained on the disk.
(2) The data structure describing free space on disk is never copied during a checkpoint or
recover (from a main-memory failure) operation.
(3) A window of only two pages of main memory is required for accessing and maintaining
the data describing free space.
(4) System information need be written to disk only during a checkpoint, rather than every
time it changes.
Lorie [7] describes a scheme similar to ours that uses bitmaps and shadowing. Gray [1, 2]
describes the update-in-place with logging paradigm that can be applied to the problem of
managing free disk pages across failures. Sturgis, Mitchell, and Israel [9] (see also Mitchell
and Dion [8]) describe an abstraction called stable storage whereby a page-allocation bitmap
is recorded redundantly on disk; the second page is not written until the first has been written
successfully.
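The redundant-write rule, in which the second copy is not written until the first has been written successfully, can be sketched in Python as below. This is only an illustration of the ordering rule, not the cited stable storage design: the class `StablePage`, the use of a CRC to detect a bad copy, and the `fail_after_first` flag that simulates a crash are all assumptions.

```python
import zlib


class StablePage:
    """Two redundant copies of one page, each stored as (checksum, data)."""

    def __init__(self):
        self.copies = [None, None]

    @staticmethod
    def _seal(data):
        """Attach a CRC-32 checksum so a damaged copy can be detected."""
        return (zlib.crc32(data), data)

    @staticmethod
    def _valid(copy):
        return copy is not None and zlib.crc32(copy[1]) == copy[0]

    def write(self, data, fail_after_first=False):
        """Write copy 0, and only then copy 1; return True if both were written."""
        self.copies[0] = self._seal(data)
        if not self._valid(self.copies[0]) or fail_after_first:
            return False  # the second copy still holds the previous contents
        self.copies[1] = self._seal(data)
        return True

    def read(self):
        """Return the data of the first readable copy, or None."""
        for copy in self.copies:
            if self._valid(copy):
                return copy[1]
        return None
```

Because the two copies are never in flight at the same time, at least one of them is readable after a single failed transfer in this model.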
Distributed File System
Bruce Walker et al. presented LOCUS, a distributed operating
system which supports transparent access to data through a network-wide file system, permits
automatic replication of storage, supports transparent distributed process execution, supplies a
number of high reliability functions such as nested transactions, and is upward compatible
with Unix. Partitioned operation of subnets and their dynamic merge is also supported. The
system has been operational for about two years at UCLA and extensive experience in its use
has been obtained. The complete system architecture is outlined in this paper, and that
experience is summarized. The most obvious conclusion to be drawn from the LOCUS work
is that a high performance, network transparent, distributed file system which contains all of
the various functions indicated throughout this paper, is feasible to design and implement,
even in a small machine environment. Replication of storage is valuable, both from the user
and the system's point of view. However, much of the work is in recovery and in dealing with
the various races and failures that can exist. Nothing is free. In order to avoid performance
degradation when resources are local, the cost has been converted into additional code and
substantial care in implementation architecture. LOCUS is approximately a third bigger than
Unix and certainly more complex. The difficulties involved in dynamically reconfiguring an
operating system are both intrinsic to the problem, and dependent on the particular system.
Rebuilding lock tables and synchronizing processes running in separate environments are
problems of inherent difficulty. Most of the system dependent problems can be avoided,
however, with careful design. The fact that LOCUS uses specialized protocols for operating
system to operating system communication made it possible to control message traffic quite
selectively. The ability to alter specific protocols to simplify the reconfiguration solution was
particularly appreciated. The task of developing a protocol by which sites would agree about
the membership of a partition proved to be surprisingly difficult. Balancing the needs of
protocol synchronization and failure detection while maintaining good performance presented
a considerable challenge. Since reconfiguration software is run precisely when the network is
flaky, those problems are real, and not events that are unlikely. Nevertheless, it has been
possible to design and implement a solution that exhibits reasonably high performance.
Further work is still needed to assure that scaling to a large network will successfully
maintain that performance characteristic, but our experience with the present solution makes
us quite optimistic. In summary, however, use of LOCUS indicates the enormous value of a
highly transparent, distributed operating system. Since file activity often is the dominant part
of the operating system load, it seems clear that the LOCUS architecture, constructed on a
distributed file system base, is rather attractive.
Philip D. L. Koch noted that the buddy system is known for its speed and simplicity.
However, high internal and external fragmentation have made it unattractive for use in
operating system file layout. A variant of the binary buddy system that reduces fragmentation
is described. Files are allocated on up to t extents, and suboptimally allocated files are
periodically reallocated. The Dartmouth Time-Sharing System (DTSS) uses this method.
Several installations of DTSS, representing different classes of workload, are studied to
measure the method's performance. Internal fragmentation varies from 2 to 6 percent, and
external fragmentation varies from 0 to 10 percent for expected request sizes. Less than 0.1
percent of the CPU is spent executing the algorithm. In addition, most files are stored
contiguously on disk. The mean number of extents per file is less than 1.5, and the upper
bound is t. Compared to the file layout method used by UNIX, the buddy system results in
more efficient access but less efficient utilization of disk space. As disks become larger and
less expensive per byte, strategies that achieve efficient I/O throughput at the expense of
some storage loss become increasingly attractive.
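One way to picture storing a file on up to t power-of-two extents is the Python sketch below. It is not Koch's procedure: the greedy rule (largest power of two that fits, with the last permitted extent rounded up) and the name `extents_for` are assumptions chosen only to show how a limit on the extent count trades space for contiguity.

```python
def extents_for(size, max_extents):
    """Power-of-two extent sizes, at most `max_extents`, whose sum covers `size`.

    Assumes size >= 0 and max_extents >= 1.
    """
    extents = []
    remaining = size
    while remaining > 0:
        if len(extents) == max_extents - 1:
            # Last permitted extent: round the remainder up to a power of two.
            block = 1
            while block < remaining:
                block *= 2
        else:
            # Largest power of two that does not exceed the remainder.
            block = 1 << (remaining.bit_length() - 1)
        extents.append(block)
        remaining -= block
    return extents
```

With a file of 13 blocks, three extents give 8 + 4 + 1 with no waste, two extents give 8 + 8 with 3 blocks wasted, and a single extent gives 16. A real buddy allocator would hold two equal adjacent extents as one larger block; the sketch leaves them separate for clarity.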

E. Levy and A. Silberschatz observed that the purpose of a distributed file system (DFS) is to
allow users of physically distributed computers to share data and storage resources by using a
common file system. A typical configuration for a DFS is a collection of workstations and
mainframes connected by a local area network (LAN). A DFS is implemented as part of the
operating system of each of the connected computers. This paper establishes a viewpoint that
emphasizes the dispersed structure and decentralization of both data and control in the design
of such systems. It defines the concepts of transparency, fault tolerance, and scalability and
discusses them in the context of DFSs. The paper claims that the principle of distributed
operation is fundamental for a fault tolerant and scalable DFS design. It also presents
alternatives for the semantics of sharing and methods for providing access to remote files. A
survey of contemporary UNIX-based systems, namely, UNIX United, Locus, Sprite, Sun's
Network File System, and ITC's Andrew, illustrates the concepts and demonstrates various
implementations and design alternatives. Based on the assessment of these systems, the paper
makes the point that a departure from the approach of extending centralized file systems over
a communication network is necessary to accomplish sound distributed file system design.
Khaled A. S. Abdel-Ghaffar et al. presented a coding-theoretic analysis of the disk allocation
problem. The authors showed the equivalence of the problem of strictly optimal disk allocation
and the class of MDS codes. One main open problem in this area is the development of tight
necessary and sufficient conditions for the existence of optimal disk allocation [8]. These
results formalize the intuitive ideas developed by Faloutsos and Metaxas [6], as well as
extend and generalize several other previous results, especially those presented by Sung [18].
Using coding theory, we have determined this minimum number for binary Cartesian product
files that have up to 16 attributes, assuming that the number of disks is a power of 2.
Sunil Prabhakar et al. presented a new scheme which provides good declustering for
similarity searching. In particular, it does global declustering as opposed to local declustering,
exploits the availability of extra disks and does not limit the partitioning of the data space.
Our technique is based upon the Cyclic declustering schemes which were developed for
range and partial match queries. We establish, in general, that Cyclic declustering techniques
outperform previously proposed techniques. The problem of efficient similarity searching is
becoming important for databases as non-textual information is stored. The problem reduces
to one of finding nearest-neighbors in high-dimensional spaces. In this paper, a new disk
allocation method for declustering high-dimensional data to optimize nearest-neighbor
queries is developed. The new scheme, called cyclic allocation, is simple to implement and is
a general allocation method in that it imposes no restrictions on the partitioning of the data
space. Furthermore, it exploits the availability of any number of disks to improve
performance. Finally, by varying the skip values the method can be adapted to yield
allocations that are optimized for various criteria. We demonstrated the superior performance
of the cyclic approach compared to existing schemes, both those that were originally designed
for range queries (FX, DM and HCAM) and those designed specifically for nearest-neighbors
(NOD). The FX and DM schemes are found to be inappropriate for nearest-neighbor queries.
HCAM performs reasonably well for odd numbers of disks, but extremely
poorly for even numbers. NOD was found not to achieve as much parallelism as Cyclic for
most cases, except when retrieving only direct neighbors with a small number of disks. NOD
also has the potential to give better performance for some dimensions when the number of
disks is close to that required to achieve near-optimality. On the other hand, NOD is restricted
to 2-way partitioning of each dimension, and its cost remains the same even when more disks
beyond those required for near-optimal declustering are available. This results in a saturation
of the gains produced by NOD beyond this point. In contrast, the Cyclic approach is not
restricted to 2-way partitioning and makes use of all available disks. In fact, its cost tracks the
lower bound and reduces as the number of disks increases. Overall we observe that the Cyclic
scheme gives the best performance for nearest-neighbor queries more consistently than any
other scheme. Given the success of the cyclic schemes for two dimensional range queries
[PAAE98], and the flexibility for nearest-neighbor queries, we expect that it will give good
performance for systems that require both types of queries.
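The role of the skip values can be illustrated with a small Python sketch. It assumes the cyclic mapping takes the form of a weighted sum of cell coordinates modulo the number of disks, with one skip value per dimension; that form, the helper names and the particular skip values used below are assumptions for illustration, and the way the original work selects skip values is not reproduced here.

```python
from itertools import product


def cyclic_disk(cell, skips, num_disks):
    """Assumed cyclic mapping: weighted sum of coordinates modulo disk count."""
    return sum(h * x for h, x in zip(skips, cell)) % num_disks


def colliding_neighbors(skips, num_disks):
    """Count neighbors of a cell that share its disk.

    Neighbors are the cells whose coordinates each differ by at most one.
    Because the mapping is linear, a neighbor shares the disk exactly when
    its offset maps to 0. Grid boundaries are ignored.
    """
    offsets = product((-1, 0, 1), repeat=len(skips))
    return sum(1 for off in offsets
               if any(off) and cyclic_disk(off, skips, num_disks) == 0)
```

With all skip values equal to 1 the mapping coincides with the sum-modulo rule of DM, and on 5 disks two diagonal neighbors of every cell fall on the cell's own disk. With skips (1, 2) on 5 disks none do, which is consistent with the observation that DM is poorly suited to nearest-neighbor queries.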
Author | Methodologies | Results
B. J. Austin | Allocation map; a dynamic allocation algorithm | 80 page tables, compared with 540 under the previous allocation algorithm
D. E. Gold et al. | Permutation algorithm; arbitrary permutation algorithm, standard permutation algorithm, permutation assignment algorithms | Latency may be masked using a small amount of buffer memory
Howard Lee Morgan | Multiple knapsack algorithm | Minimizes the expected number of mount requests which must be satisfied to process a set of jobs
Kenneth K. Shen et al. | Weighted buddy method | For a uniform request distribution, the buddy system has less total memory fragmentation than the weighted buddy algorithm
James A. Hinds | Relative block storage scheme |
Warren Burton | Generalization of the buddy system |
James L. Peterson | Fibonacci buddy system | The total fragmentation varies because of internal characteristics
H. C. Du et al. | Cartesian product files | Effective design of optimal file systems for partial match queries
Philip D. L. Koch | Variant of binary buddy system | Reduces fragmentation
Khaled A. S. Abdel-Ghaffar | Coding-theoretic analysis of the disk allocation problem | Response time of disk has improved
Sunil Prabhakar et al. | Cyclic declustering | Cyclic scheme gives the best performance for nearest-neighbor queries more consistently than any other scheme
Table 1 Research work on disk allocation algorithms
*References [1]-[13] are on disk allocation methods; the rest of the papers are on file
systems.
REFERENCES
1. Gold, D. E., & Kuck, D. J. (1974). A model for masking rotational latency by
dynamic disk allocation. Communications of the ACM, 17, 278-288.
doi:10.1145/360980.361006
2. Morgan, H. L. (1974). Optimal space allocation on disk storage devices.
Communications of the ACM, 17(3), 139-142. doi:10.1145/360860.360867
3. Shen, K. K., & Peterson, J. L. (1974). A weighted buddy method for dynamic storage
allocation. Communications of the ACM, 17, 558-562. doi:10.1145/355620.361164
4. Burton, W. (1976). A buddy system variation for disk storage allocation.
Communications of the ACM, 19(7), 416-417. doi:10.1145/360248.360259
5. Peterson, J. L., & Norman, T. A. (1977). Buddy systems. Communications of the ACM,
20, 421-431. doi:10.1145/359605.359626
6. Du, H. C., & Sobolewski, J. S. (1982). Disk allocation for Cartesian product files on
multiple-disk systems. ACM Transactions on Database Systems, 7(1), 82-101.
doi:10.1145/319682.319698
7. Hecht, M. S., & Gabbe, J. D. (1983). Shadowed management of free disk pages with a
linked list. ACM Transactions on Database Systems, 8(4), 503-514.
doi:10.1145/319996.320002
8. Walker, B., Popek, G., English, R., Kline, C., & Thiel, G. (1983). The LOCUS
distributed operating system. University of California at Los Angeles, 49-70.
9. Koch, P. D. L. (1987). Disk file allocation based on the buddy system. ACM
Transactions on Computer Systems, 5(4), 352-370. doi:10.1145/29868.29871
10. Levy, E., & Silberschatz, A. (1990). Distributed file systems: Concepts and
examples. ACM Computing Surveys, 22(4), 321-374.
11. Abdel-Ghaffar, K. A. S., & El Abbadi, A. (1993). Optimal disk allocation for partial
match queries. ACM Transactions on Database Systems, 18(1), 132-156.
doi:10.1145/151284.151288
12. Krieger, O., & Stumm, M. (1997). HFS: a performance-oriented flexible file system
based on building-block compositions. ACM Transactions on Computer Systems,
15(3), 286-321. doi:10.1145/263326.263356
13. Prabhakar, S., Agrawal, D., & El Abbadi, A. (1998). Efficient disk allocation for
fast similarity searching. University of California, Santa Barbara, 78-87.
14. Hess, C. K., & Campbell, R. H. (2003). An application of a context-aware file
system, 339-352. doi:10.1007/s00779-003-0250-y
15. Ghemawat, S., Gobioff, H., & Leung, S.-T. (2003). The Google file system. ACM
SIGOPS Operating Systems Review, 37, 29. doi:10.1145/1165389.945450
16. Gal, E., & Toledo, S. (2005). Algorithms and data structures for flash memories. ACM
Computing Surveys, 37(2), 138-163. doi:10.1145/1089733.1089735
17. Sivathanu, M., Prabhakaran, V., Arpaci-Dusseau, A. C., & Arpaci-Dusseau, R. H.
(2005). Improving storage system availability with D-GRAID. ACM Transactions on
Storage, 1(2), 133-170. doi:10.1145/1063786.1063787
18. Kang, S., & Reddy, A. L. N. (2006). An approach to virtual allocation in storage
systems. ACM Transactions on Storage, 2(4), 371-399.
doi:10.1145/1210596.1210597
19. Wang, A.-I. A., Kuenning, G., Reiher, P., & Popek, G. (2006). The Conquest file
system. ACM Transactions on Storage, 2(3), 309-348. doi:10.1145/1168910.1168914
20. Agrawal, N., Bolosky, W. J., Douceur, J. R., & Lorch, J. R. (2007). A Five-Year Study
of File-System Metadata, 3(3).
21. Cipar, J., Corner, M. D., & Berger, E. D. (2007). Contributing storage using the
transparent file system. ACM Transactions on Storage, 3(3), 12-es.
doi:10.1145/1288783.1288787
22. Batsakis, A., Burns, R., Kanevsky, A., Lentini, J., & Talpey, T. (2009). CA-NFS. ACM
Transactions on Storage, 5(4), 1-24. doi:10.1145/1629080.1629085
23. Thomasian, A., & Blaum, M. (2009). Higher reliability redundant disk arrays. ACM
Transactions on Storage, 5(3), 1-59. doi:10.1145/1629075.1629076
24. Ryu, S., Lee, C., Yoo, S., & Seo, S. (2010). Flash-aware cluster allocation method
based on filename extension for FAT file system. Proceedings of the 2010 ACM
Symposium on Applied Computing - SAC '10, 502. doi:10.1145/1774088.1774192
25. Jung, J., Won, Y., Kim, E., Shin, H., & Jeon, B. (2010). Frash. ACM Transactions on
Storage, 6(1), 1-25. doi:10.1145/1714454.1714457
26. Shin, D. I., Yu, Y. J., Kim, H. S., Eom, H., & Yeom, H. Y. (2011). Request Bridging
and Interleaving. ACM Transactions on Storage, 7(2), 1-31.
doi:10.1145/1970348.1970349
27. Wu, X., & Reddy, A. L. N. (2011). SCMFS: A file system for Storage Class Memory.
2011 International Conference for High Performance Computing, Networking,
Storage and Analysis (SC), 9(3), 1-11. doi:10.1145/2063384.2063436
28. Hsieh, J.-W., Wu, C.-H., & Chiu, G.-M. (2012). Mftl. ACM Transactions on Storage,
8(2), 1-29. doi:10.1145/2180905.2180908
29. Paulo, J., & Pereira, J. (2014). A Survey and Classification of Storage Deduplication
Systems. ACM Computing Surveys, 47(1), 1-30. doi:10.1145/2611778
