INTRODUCTION TO DBMS
INTRODUCTION
We need to understand the relevance and scope of databases in the data processing area.
We do this by first understanding the properties and characteristics of data and the nature of data
organization. We then look at the objectives of database technology and its
characteristics by studying different packages available in the market.
A data structure can be defined as a specification of data. Different data structures such as arrays,
stacks, queues, trees and graphs are used to implement data organization in main memory. Several
strategies are used to support the organization of data in secondary memory. In this unit we will
look at different strategies available for organizing data in secondary memory. We will also learn
about data representation for files in external storage devices, so that required operations (e.g.
retrieval, update) may be carried out efficiently.
OBJECTIVES:
The following are the objectives of database management systems.
• SHAREABILITY
• AVAILABILITY
• EVOLVABILITY
• ADAPTABILITY
• INTEGRITY
Unit- 4 INTRODUCTION TO DATABASE CAM (MBA-1st Sem)
SHAREABILITY
This list indicates some of the additional problems which arise in managing shared data. A
central implication of sharing is that compromise will often be required between conflicting user
needs as, for example, in the establishment of a data structure and corresponding storage
structure.
AVAILABILITY
Availability means bringing the data of an organization to the users of that data. The system
which manages data resources should be easily accessible to the people within an organization,
making the data available when and where it is needed, and in the manner and form in which it is
needed. Availability refers to both the data and the DBMS which delivers the data. Availability
functions make the database available to users: defining and creating a database, and getting data
in and out of a database. These are the direct functions performed by a DBMS. A DBMS should
accommodate diversity in the data stored.
The bulk of organization data, as traditionally handled in accounting systems, lay in the
enclosed region of historical, internal, financial data. A database management system must be
capable of reaching beyond this region to handle greater diversity in the data stored, including
subjective data, fragmentary marketing intelligence data, uncertain forecasts and aggregated
data, as well as factual marketing, manufacturing, personnel and accounting data.
EVOLVABILITY
Evolvability refers to the ability of the DBMS to change in response to growing user needs and
advancing technology. Evolvability is the system characteristic that enhances future availability
of the data resources. Evolvability is not the same as expandability or extensibility, which imply
extending or adding to the system, which then grows ever larger. Evolvability covers expansion
or contraction, both of which may occur as the system changes to fit the ever changing needs and
desires of the using environment.
INTEGRITY
The importance and pervasiveness of the need to maintain database integrity is rooted in the
reality that man is not perfect. Destruction, errors and improper disclosure must be anticipated and
explicit mechanisms provided for handling them. The three primary facets of database integrity
are:
In developing DBMSs, the accountant’s concept of internal control has been practically ignored.
Computer specialists need such concepts to improve database integrity and enhance management
confidence.
Apart from these objectives, there are several other objectives, which are as follows.
Database management systems are programs that are written to store, update, and retrieve
information from a database. There are many database products available in the market. The most
popular are Oracle and SQL Server. The Oracle database is from Oracle Corporation and
SQL Server is from Microsoft Corporation. There are also freely available databases such as
MySQL, which is open source. Database management systems are available for
personal computers and for large systems such as mainframes. DB2 is a database from IBM for
mainframe systems.
Structured Query Language (SQL) is used for querying databases. Variations of this
language are available in the form of T-SQL and PL/SQL. The data that is available in the
database is represented in various formats. Usually a report writer program is bundled with the
database for generating reports. Crystal Reports is one such application that is bundled with SQL
Server 2000 and later versions of it. These report-generating programs make it easy to
generate any kind of report based on the data that is available in the database. Graphics
components are also available in database management systems to generate reports in the
form of charts and graphs.
Products like Quest can be used in conjunction with your database to get more out of it.
Database quality and performance can be improved by using the productivity tools
that Quest provides for DBAs. Such a product enables you to develop and test SQL code for
optimum quality and performance before an application ever reaches the production
environment. Business-threatening performance issues are detected and diagnosed before they
reach the end user, and are resolved as early as possible without interrupting the
business. Monitoring, diagnostics, tuning, space management and high availability are the
solutions provided by Quest-like products. These products are also used to automate and enforce
business processes. Impact analysis, patch management, version control,
audit trail documentation and migration support are available in most database
management products.
Quest-like products are available for database platforms like Oracle, SQL Server, DB2, MySQL
and Sybase. With support for these platforms, they aim to provide uninterrupted service for
these databases. Multiple distributed SQL Server databases can be managed easily and
efficiently. High performance and availability are targeted for vital UDB and OS/390 databases.
Tools and services to manage MySQL environments are available. Monitoring,
diagnosing, and optimizing the performance of a Sybase database can be done using the tools
in Quest.
Unicenter Database Performance is another product from Computer Associates available for
DB2 for z/OS and IMS for z/OS and distributed RDBMS. Unicenter Database Administration
solutions for the above databases are also available. Unicenter Backup and Recovery solutions
for DB2 for z/OS, IMS for z/OS and distributed RDBMS are also available.
Although database management systems for these popular databases are available, there are also
small database management systems available for taking backup of your database that is used in
websites. Many such tools are available which enable you to manage your databases online
through easy to use interfaces. These products are very useful for websites that are hosted using
shared IP addresses.
All data items have certain fundamental properties. It is important to know them first in
order to create databanks. The first and foremost property of data is its form. Every data element
has a form. Data items are classified into different data types based on their form. The form
decides the way the item is stored in the computer.
DATA TYPES
Data can be classified as textual, picture or voice data based on its form. The last two types,
namely picture and voice, are special forms of data and are normally used less frequently. It is
textual data that is the largest in volume and the most used, so let us focus on that first. Textual data can
be numeric or alphanumeric (a combination of numeric and alphabetic characters).
Example:
As you can notice from the examples, pure numeric data items can be classified further into
two types. One of them is the whole number (like the number of students in a class or the number of vehicles
in the city). These are called integers. On the other hand, we also have numeric data that
includes fractions (for example, the price of an item is 48.56, or the maximum temperature today was 28.32).
These data items are called real numbers. The difference between the integer and real
data types is important to us because they are represented and manipulated differently in a
computer.
The next data type is alphabetic or alphanumeric. This type of data is made up of alphabetic
and numeric characters (e.g., the name of a person is HARI; the registration number of a vehicle is KA-09
F-1234). This type of data may contain digits along with letters, but the digits are not used
as numeric data in any calculation. This data type is called a string of alphanumeric characters.
How are these data represented inside a computer?
DATA REPRESENTATION
All data in a computer must be represented using only two symbols, namely 0 and 1. This
system of representation is called binary representation. In order to represent all data types in
computers using only 0 and 1, some kind of coding is needed. Integers are represented directly as
binary numbers. Real numbers are represented using a technique called floating-point
representation. Strings are represented through a character coding scheme called ASCII
(American Standard Code for Information Interchange). ASCII itself defines 7-bit codes, but each
character is normally stored in 8 bits (binary digits), that is, one byte.
Example:
Even pictorial and voice data gets coded into a large number of 0’s and 1’s.
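The three representations described above can be inspected with a short Python sketch. The sample values (45 students, a price of 48.56, the name HARI) are illustrative, and the 32-bit IEEE 754 layout used for the real number is an assumption; the text only says "floating point".

```python
import struct

# Integer: represented directly as a binary number.
students = 45
int_bits = format(students, "08b")

# Real number: packed as a 32-bit IEEE 754 float (assumed format), big-endian.
price = 48.56
float_bytes = struct.pack(">f", price)
float_bits = "".join(format(b, "08b") for b in float_bytes)

# String: one ASCII code per character, each stored in one 8-bit byte.
name = "HARI"
ascii_codes = list(name.encode("ascii"))
name_bits = " ".join(format(code, "08b") for code in ascii_codes)

print("integer:", int_bits)
print("real   :", float_bits)
print("string :", ascii_codes, name_bits)
```

The integer and the string map to bits in an obvious way, while the real number is split into sign, exponent and fraction fields, which is why the two numeric types are handled differently by the machine.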
DATA SIZE
All data items have a size. Looking at the previous examples, we may say that the number of
students in a class needs 2 digits of space, and the price of an item may need 4 digits (2 before
the decimal point and 2 after it; the decimal point itself need not be stored). A name string may need a
maximum of 30 character positions. When it is stored inside a computer, it may therefore
need 30 x 8 = 240 bits. Picture data may need several thousand bit positions. The size property
is of special importance to us because we need to provide adequate space to store these items in
the system. Further, DBMS packages should be able to distinguish these data types and provide
the necessary functions to manipulate them.
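The size arithmetic above can be checked with a small sketch. The function name `string_field_bits` is hypothetical, and the sketch assumes one 8-bit byte per character as stated in the text.

```python
BITS_PER_CHAR = 8  # one 8-bit byte per character, as in the text


def string_field_bits(max_chars: int) -> int:
    """Storage needed for a fixed-length character field, in bits."""
    return max_chars * BITS_PER_CHAR


name_field_bits = string_field_bits(30)        # the 30-character name field
hari_bits = len("HARI".encode("ascii")) * 8    # an actual 4-character value

print(name_field_bits, hari_bits)
```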
RELATIONSHIP
Even though data items are individual entities, they never occur in isolation in the real
world. They are always associated with other data items. For example, the data item price is related to the
vehicle in question, the date of the transaction and the seller.
There are three different types of data relationships. Let us understand each one of them.
The simplest of all is the 1:1 relationship. For each value of a data item there is one and only one
corresponding value in the other item.
Normally all such data items are grouped and kept together as a record.
The second type of relationship is one to many (1:M). Here, for every value of one data item
there are several values of the other data item. In the reverse direction, however, several values of the other
data item are related to a unique value of this data item.
E.g.: 1. A book has several chapters. But several chapters correspond to one and
only one book.
2. A person can own several vehicles; all vehicles will have only one owner.
One to many relationships can be represented in computers using pointers and arrays.
(Details later)
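As a rough illustration of the "pointers and arrays" idea, the owner and vehicle example can be sketched in Python, where a list plays the role of the array and a dictionary entry plays the role of the pointer. The owner RAVI and all registration numbers other than KA-09 F-1234 are made up for the example.

```python
# The "one" side keeps an array (list) of the "many" side.
vehicles_by_owner = {
    "HARI": ["KA-09 F-1234", "KA-09 G-5678"],
    "RAVI": ["KA-05 A-1111"],
}

# Reverse direction: each vehicle points back to exactly one owner.
owner_of = {
    vehicle: owner
    for owner, vehicles in vehicles_by_owner.items()
    for vehicle in vehicles
}

print(vehicles_by_owner["HARI"])   # several vehicles for one person
print(owner_of["KA-09 F-1234"])    # one and only one owner per vehicle
```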
The third type of relationship is called many to many (N:M). Most relationships in the real
world are of this type.
2. A book can have several Authors. An author might have written several
books.
This type of relationship is difficult to represent and handle in computers. Hence, as far as
possible, we try to reduce it to two one-to-many relations (1:M and N:1) and eliminate the one
that is irrelevant to the user.
The Database must maintain all the data and their relationships and allow the user to access
data based on these relations.
E.g.: Get me all vehicles owned by a person. Get me the subjects taught by a teacher.
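One common way to carry out this reduction is to store each association as a pair and derive the two one-to-many mappings from the pairs. The sketch below uses the teacher and subject query from the example; the teacher and subject names are invented, and the pair list is only one possible representation.

```python
from collections import defaultdict

# Each pair records one association between a teacher and a subject.
teaches = [
    ("Asha", "Maths"),
    ("Asha", "Physics"),
    ("Ravi", "Maths"),
]

subjects_of = defaultdict(list)   # teacher -> subjects (1:M)
teachers_of = defaultdict(list)   # subject -> teachers (1:M)
for teacher, subject in teaches:
    subjects_of[teacher].append(subject)
    teachers_of[subject].append(teacher)

print(subjects_of["Asha"])    # "Get me the subjects taught by a teacher"
print(teachers_of["Maths"])   # the reverse query
```

If only one direction matters to the user, the other mapping can simply be dropped, which is the elimination step mentioned above.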
FIELD
A field is the next higher level of data. A field consists of a grouping of characters.
E.g.: An employee’s salary is an attribute that is a typical data field associated with
the entity employee (in 1: 1 relation)
RECORD
Related data fields are grouped to form a RECORD. A record thus is a collection of
attributes that describe an entity.
E.g.: 1. An employee record could consist of attributes like the employee's ID, name,
salary etc.
2. Set of subjects taught for a class during each hour.
FILE
E.g.: 1. A group of all employee records showing one record for each employee
could be an employee file.
2. Timetable for a class for a week showing subjects taught each hour on each
day of the week.
Files are frequently classified by the application for which they are primarily used such as
payroll file, Inventory file etc.
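The field, record and file levels can be pictured with a minimal Python sketch. The class name `EmployeeRecord` and the sample values are hypothetical; a real file would of course live on secondary storage rather than in a list.

```python
from dataclasses import dataclass


@dataclass
class EmployeeRecord:
    """A record: related fields that describe one employee entity."""
    emp_id: int     # field
    name: str       # field
    salary: float   # field


# A file: a group of records of the same kind, one per employee.
employee_file = [
    EmployeeRecord(1, "HARI", 25000.0),
    EmployeeRecord(2, "ASHA", 32000.0),
]

# A payroll-style application reads the file record by record.
total_salary = sum(record.salary for record in employee_file)
print(total_salary)
```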
DATABASE
E.g.: 1. The timetable for an entire school showing the details of classes, subjects,
rooms, teachers etc.
A Personnel database consolidates data files like, Payroll files, Personnel action files,
employee skill files etc.
[Figure: the Payroll file and the Inventory file are consolidated into a database, which users access through application programmes.]
Creation of database involves specifying data types, structures and their relationship
constraints for the data stored in database.
Maintenance of database includes such functions as updating and accessing the data in the
database to reflect changes in the real world.
E.g.: Let us consider a college environment, wherein we need to maintain data about
class scheduling. Data like
The basic entities in this example are subjects, courses, teachers, rooms, students etc.;
there will be associations or relationships linking these entities.
A teacher may teach several subjects. Several teachers may teach a subject.
COMPONENTS OF DBMS
DBMS packages on personal computers allow end users to develop databases for their
personal needs. These are called single-user databases. However, large organizations with many
users usually place control of enterprise database development in the hands of
DATABASE ADMINISTRATORS (DBAs) and other specialists. This improves the integrity and security of
organizational databases. Database developers use DATA DEFINITION LANGUAGE (DDL) to
specify data structures, relationships and modify these structures if needed. The detailed
Users are allowed to insert, modify, delete and retrieve data from the database according
to their needs. They use DATA MANIPULATION LANGUAGE (DML) for this purpose.
Further, the DBA needs to guard the database against media failures, accidental erasure etc. For this
purpose, the DBA creates copies of the database and of the changes occurring, for later recovery in case
of failure. The DBA uses DATABASE UTILITIES to handle these functions of backup and recovery.
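The three roles described above (DDL, DML and utilities) can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch using an in-memory SQLite database; the `employee` table and its rows are invented for illustration, and `Connection.backup` stands in for the backup utilities that a full DBMS would provide.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# DDL: specify the structure of the data.
db.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL, salary REAL)"
)

# DML: insert, modify, delete and retrieve data.
db.execute("INSERT INTO employee VALUES (1, 'HARI', 25000)")
db.execute("INSERT INTO employee VALUES (2, 'ASHA', 32000)")
db.execute("UPDATE employee SET salary = 27000 WHERE id = 1")
db.execute("DELETE FROM employee WHERE id = 2")
db.commit()
rows = db.execute("SELECT id, name, salary FROM employee").fetchall()

# Utility: copy the database so that it can be recovered after a failure.
backup_db = sqlite3.connect(":memory:")
db.backup(backup_db)
backup_rows = backup_db.execute("SELECT id, name, salary FROM employee").fetchall()

print(rows, backup_rows)
```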
DBMS Data
• Range of machine sizes from PC to mainframe, isolated or networked.
• DBMS runs on entire range of platforms.
11 Prep. By MOHAMMAD DANISH ( Lecturer , CSE Deptt.) MBA ,AFSET
DBMS Hardware
DBMS Software
DBMS Users
DBMS Examples
• There are several reasons for using a DBMS that follow on from each other.
• Different models of the same data different organizations.
• Relational model is popular because it is abstract and computing evolution has always
been towards the more abstract.
• Logical organization gives a clear picture and helps programmers achieve faster
development of application programs.
• Handles low-level file maintenance.
• Yields centralization of information. This, in turn is a good thing as:
• Redundancy is eliminated
• Inconsistency is avoided
• Data is shared
• Standards are enforced
• Security is applied
• Integrity is maintained
• Requirements are balanced
• Yields data independence where data organization is not built into application programs,
for example
• Representation of numeric data
• Units for numeric data
• Data coding
• Stored record and stored file structure
• DBA can change access structures during the mid-life of the DBMS without affecting
DBMS users, except with respect to performance.
Functional organization.
• Does not cover many DBMS functions like concurrency, backup, security etc.
• Users use a language incorporating a data sublanguage for the database consisting of:
• Data definition language (DDL)
• Data manipulation language (DML)
• Individual user's view is an external view ... multiple occurrences of multiple types of
external records
• Views are defined by an external schema which is defined in DDL
• Defines types of stored records, indices, how fields are represented, in what
sequence, etc.
• Defined using an internal DDL.
• Programs accessing this layer are dangerous because they bypass security and
integrity checks of the internal layer.
• Mappings exist between the different levels of the 3LA and the DBA is
responsible for correct mapping between the levels.
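The idea that a user's external view is defined in DDL can be illustrated with an SQL view. The sketch below uses Python's `sqlite3` module; the `employee` table, the `staff_directory` view and the rows are hypothetical, and a view is only an approximation of a full external schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
db.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "HARI", "Sales", 25000), (2, "ASHA", "Accounts", 32000)],
)

# External view defined in DDL: this user group never sees the salary column.
db.execute("CREATE VIEW staff_directory AS SELECT id, name, dept FROM employee")

directory = db.execute("SELECT * FROM staff_directory ORDER BY id").fetchall()
print(directory)
```

The DBMS maintains the mapping from the view to the stored table, so the way `employee` is stored can change without the users of `staff_directory` noticing.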
TYPES OF DATABASES
OPERATIONAL DATABASES:
These databases store detailed data needed to support an entire organization. They are also
called subject area databases (SADB), transaction databases and production databases. These
databases carry up-to-date information on business activities. Business supervisors in charge of
day-to-day operations use them most frequently.
ANALYTICAL DATABASES:
These databases contain information extracted from operational databases. They are used by
managers to study the trends and patterns emerging in the business in order to make strategic
decisions and set policy. They are also known as data warehouses, information databases
and decision support databases. They are generally used in query mode rather than update
mode. Techniques like Online Analytical Processing (OLAP) and data mining are used on these
databases to generate meaningful information for business analysis, market research etc.
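"Query mode" typically means read-only summarizing queries over data that was extracted earlier. A very small sketch of such a query, using Python's `sqlite3` module with an invented `sales` table and figures, might look like this; real OLAP tools go much further than a single GROUP BY.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (region TEXT, month TEXT, amount REAL)")
# Rows assumed to have been extracted from an operational database.
db.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", "2024-01", 100.0),
        ("North", "2024-02", 150.0),
        ("South", "2024-01", 80.0),
        ("South", "2024-02", 60.0),
    ],
)

# Analytical use: summarize, never update.
totals = db.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```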
DISTRIBUTED DATABASES:
END-USER DATABASES:
These databases consist of a variety of data files created by end users on their PCs for
personal use. They are generally single-user databases with less stress on backup and
recovery. The data in these databases may be generated with word processors, spreadsheets and
other PC software packages.
MULTIMEDIA DATABASES:
These databases include non-conventional data like pictures and voice tracks along with
conventional alphanumeric data. These databases tend to be huge in size and access is done
through specialized access language constructs. The data accessed further needs to be interpreted
and displayed by additional front-end software like Browsers and media players. From database
management viewpoint, the set of interconnected multimedia data needs to be handled as
specialized structures rather than simple records.
These databases are developed and used for certain special-purpose applications. Spatial
databases, temporal databases, biological databases etc. belong to this category. The data
stored in these applications is of a different kind and needs to be interpreted according to the
ground rules of those applications. Hence special techniques are used for storage and access of
data in these databases.
E-R Definitions
• Entity
An instance of a physical object in the real world.
• Entity Class
A group of objects of the same type.
• Attributes (Properties)
Entities have attributes or properties that describe their characteristics.
• Composite Attribute
An attribute that is composed of several more basic attributes.
• Simple Attribute
An attribute which is not divisible.
• Single-Valued Attribute
An attribute that has a single value for a particular entity.
• Multi-Valued Attribute
An attribute that has a set of values for the same entity.
• Value Set
Each simple attribute is associated with a value set (or domain) which specifies the
set of values that may be assigned to that attribute for each individual entity.
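These definitions can be made concrete with a small Python sketch. The `Employee` entity class, its attributes and the department names are all hypothetical; the point is only to show one attribute of each kind.

```python
from dataclasses import dataclass, field
from typing import List

# Value set (domain) for the simple attribute `dept`; values are made up.
DEPT_VALUES = {"Sales", "Accounts", "Personnel"}


@dataclass
class Name:
    """Composite attribute: composed of more basic attributes."""
    first: str
    last: str


@dataclass
class Employee:
    """Entity class: every instance is one entity."""
    emp_id: int                                        # simple, single-valued
    name: Name                                         # composite
    dept: str                                          # simple, drawn from DEPT_VALUES
    phones: List[str] = field(default_factory=list)    # multi-valued

    def __post_init__(self) -> None:
        # Enforce the value set when an entity is created.
        if self.dept not in DEPT_VALUES:
            raise ValueError(f"{self.dept!r} is not in the value set for dept")


hari = Employee(1, Name("HARI", "RAO"), "Sales", ["080-1111", "080-2222"])
print(hari)
```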
E-R Notation
Entity Types
Relationship Types
Attributes
Composite Attributes
Multi-valued Attributes
Key Attributes
E-R EXAMPLE
University Database
"A lecturer, identified by his or her number, name and room number, is responsible for
organising a number of course modules. Each module has a unique code and also a name and
each module can involve a number of lecturers who deliver part of it. A module is composed of a
series of lectures and because of economic constraints and common sense, sometimes lectures on
a given topic can be part of more than one module. A lecture has a time, room and date and is
delivered by a lecturer and a lecturer may deliver more than one lecture. Students, identified by
number and name, can attend lectures and a student must be registered for a number of modules.
We also store the date on which the student first registered for that module. Finally, a lecturer
acts as a tutor for a number of students and each student has only one tutor."
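One possible reading of this description as relational tables is sketched below with Python's `sqlite3` module. It is an interpretation, not the definitive E-R design: all table and column names are assumptions, the `lecture.id` surrogate key is not in the description, and each many-to-many relationship is reduced to a linking table.

```python
import sqlite3

SCHEMA = """
CREATE TABLE lecturer (
    number INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    room   TEXT
);
CREATE TABLE module (
    code      TEXT PRIMARY KEY,
    name      TEXT NOT NULL,
    organiser INTEGER NOT NULL REFERENCES lecturer(number)  -- one organiser, many modules
);
CREATE TABLE module_lecturer (      -- lecturers who deliver part of a module (N:M)
    module_code     TEXT    REFERENCES module(code),
    lecturer_number INTEGER REFERENCES lecturer(number),
    PRIMARY KEY (module_code, lecturer_number)
);
CREATE TABLE lecture (
    id              INTEGER PRIMARY KEY,   -- surrogate key (assumption)
    lecture_date    TEXT,
    start_time      TEXT,
    room            TEXT,
    lecturer_number INTEGER NOT NULL REFERENCES lecturer(number)
);
CREATE TABLE module_lecture (       -- a lecture can be part of several modules (N:M)
    module_code TEXT    REFERENCES module(code),
    lecture_id  INTEGER REFERENCES lecture(id),
    PRIMARY KEY (module_code, lecture_id)
);
CREATE TABLE student (
    number INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    tutor  INTEGER NOT NULL REFERENCES lecturer(number)     -- only one tutor
);
CREATE TABLE registration (         -- student registered for module, with date
    student_number INTEGER REFERENCES student(number),
    module_code    TEXT    REFERENCES module(code),
    registered_on  TEXT,
    PRIMARY KEY (student_number, module_code)
);
CREATE TABLE attendance (           -- students attend lectures (N:M)
    student_number INTEGER REFERENCES student(number),
    lecture_id     INTEGER REFERENCES lecture(id),
    PRIMARY KEY (student_number, lecture_id)
);
"""

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")   # SQLite checks foreign keys only when enabled
db.executescript(SCHEMA)

tables = [
    row[0]
    for row in db.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)
```

The "each student has only one tutor" rule becomes a single `tutor` column on `student`, while "a lecturer acts as a tutor for a number of students" needs no extra table because it is the same relationship read in the other direction.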
BANKING:
AIRLINES:
For reservations and schedule information. Airlines were among the first to
use databases in a geographically distributed manner: terminals situated
around the world accessed the central database system through phone lines
and other data networks.
UNIVERSITIES:
TELECOMMUNICATION:
FINANCE:
SALES:
MANUFACTURING:
HUMAN RESOURCES:
For information about employees, salaries, payroll taxes and benefits, and
for generation of paychecks.
ADVANTAGES:
File-based systems are relatively easy to design and implement since they are normally based on a single
application or information system. Processing can also be faster than with other ways of storing
data.
DISADVANTAGES:
• Program-data dependence.
• Duplication of data.
• Limited data sharing.
• Lengthy program and system development time.
• Excessive program maintenance when the system changes.
• Duplication of data items in multiple files. Duplication affects input, maintenance and
storage, and can cause data integrity problems.
• Inflexibility and non-scalability. Since conventional files are designed to support a single
application, the original file structure cannot support new requirements.
Today, the trend is in favor of replacing file-based systems and applications with database
systems and applications.
DATABASE APPROACH
A database is more than a file – it contains information about more than one entity and
information about relationships among the entities.
Data about each single entity (e.g., Product, Customer, Customer Order, and Department) is
stored in its own “table” in the database. Databases are designed to meet the needs of multiple users
and to be used in multiple applications.
One significant development in the more user-friendly relational DBMS products is that users
can sometimes get their own answers from the stored data by learning to use data querying
methods.
Advantages
There are three main features of a database management system that make it attractive to use a
DBMS in preference to more conventional software. These features are centralized data
management, data independence, and systems integration.
In a database system, the data is managed by the DBMS and all access to the data is through
the DBMS, which provides a key to effective data processing. This contrasts with conventional data
processing systems where each application program has direct access to the data it reads or
manipulates. In a conventional DP system, an organization is likely to have several files of
related data that are processed by several different application programs.
In conventional data processing, application programs are usually written with
considerable knowledge of the data structure and format. In such an environment, any change to the data
structure or format requires corresponding changes to the application programs. These
changes could be as small as the following:
1. Coding of some field is changed. For example, a null value that was coded as -1
is now coded as -9999.
2. A new field is added to the records.
3. The length of one of the fields is changed. For example, the maximum number
of digits in a telephone number field or a postcode field needs to be changed.
4. The field on which the file is sorted is changed.
If some major changes were to be made to the data, the application programs may
need to be rewritten. In a database system, the database management system provides
the interface between the application programs and the data. When changes are made
to the data representation, the metadata maintained by the DBMS is changed but the
DBMS continues to provide data to application programs in the previously used way.
The DBMS handles the task of transformation of data wherever necessary.
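The null-coding change from the list above gives a compact illustration of data independence. In this sketch a dictionary named `metadata` and the function `read_age` stand in for the DBMS (both hypothetical), and the ages are made-up sample data; the application function never mentions -1 or -9999.

```python
from typing import List, Optional

# Metadata kept by the access layer: how a missing value is coded in storage.
metadata = {"null_code": -1}
stored_ages = [34, -1, 51]


def read_age(raw: int) -> Optional[int]:
    """Access layer: translate the stored coding into what applications expect."""
    return None if raw == metadata["null_code"] else raw


def application_average(raw_values: List[int]) -> float:
    """Application program: works only with the values handed over by read_age."""
    ages = [age for age in map(read_age, raw_values) if age is not None]
    return sum(ages) / len(ages)


before = application_average(stored_ages)

# The coding changes: only the metadata and the stored data are altered.
metadata["null_code"] = -9999
stored_ages = [34, -9999, 51]
after = application_average(stored_ages)

print(before, after)
```

In a conventional file system the test for -1 would be written into every program that reads the file, and each of those programs would have to be edited.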
A DBMS is often used to provide better service to the users. In conventional systems,
availability of information is often poor since it normally is difficult to obtain
information that the existing systems were not designed for. Once several
conventional systems are combined to form one centralized database, the availability
of information and its currency are likely to improve, since the data can now be
shared and the DBMS makes it easy to respond to unforeseen information requests.
Changes are often necessary to the contents of data stored in any system. These changes are
more easily made in a database than in a conventional system, in that they need not
have any impact on application programs.
Since all access to the database must be through the DBMS, standards are
easier to enforce. Standards may relate to the naming of the data, the
format of the data, the structure of the data etc.
Since the data of the organization using a database approach is centralized and would be used
by a number of users at a time, it is essential to enforce integrity controls. Integrity may be
compromised in many ways. For example, someone may make a mistake in data input and the
salary of a full-time employee may be input as $4,000 rather than $40,000. A student may be
shown to have borrowed books but has no enrolment. Salary of a staff member in one
department may be coming out of the budget of another department. If a number of users are
allowed to update the same data item at the same time, there is a possibility that the result of the
updates is not quite what was intended. For example, in an airline DBMS we could have a
situation where the number of bookings made is larger than the capacity of the aircraft that is to
be used for the flight. Controls must therefore be introduced to prevent such errors from occurring
because of concurrent updating activities.
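One kind of control for the airline case is a rule stored with the data itself, so that no program can record more bookings than seats. The sketch below uses a CHECK constraint in SQLite through Python's `sqlite3` module; the `flight` table, the flight number XY101 and the `book_seat` function are hypothetical. It runs in a single connection, so it shows the rule being enforced rather than true concurrent access.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    """
    CREATE TABLE flight (
        flight_no TEXT PRIMARY KEY,
        capacity  INTEGER NOT NULL,
        booked    INTEGER NOT NULL DEFAULT 0,
        CHECK (booked <= capacity)   -- integrity rule enforced by the DBMS
    )
    """
)
db.execute("INSERT INTO flight (flight_no, capacity) VALUES ('XY101', 2)")
db.commit()


def book_seat(flight_no: str) -> bool:
    """Try to book one seat; return False if the flight is already full."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "UPDATE flight SET booked = booked + 1 WHERE flight_no = ?",
                (flight_no,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [book_seat("XY101") for _ in range(3)]
booked = db.execute("SELECT booked FROM flight").fetchone()[0]
print(results, booked)
```

Because the increment is a single UPDATE checked by the DBMS, the third booking is rejected instead of silently pushing the count past the capacity.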
All enterprises have sections and departments, and each of these units often considers
its own work, and therefore its own needs, as the
most important. Once a database has been set up with centralized control, it will be
necessary to identify enterprise requirements and to balance the needs of competing
units. It may become necessary to ignore some requests for information if they
conflict with higher priority needs of the enterprise.