
Database Systems Notes

Database – a self-describing collection of integrated records. A database contains data but also a
description of that data known as the system catalog or metadata.

One advantage of a database is data abstraction: adding or modifying data structures in the database does not impact the application programs that interface with it (provided the modified elements of the data structure aren’t directly used by the application program).

Before a database is designed, the information needs of an organization must be stated. These needs are described using three terms: entity, attribute, and relationship.

An entity is an object that is to be defined or presented in the database.

An attribute is a property of an entity worthy of recording.

A relationship is an association between entities.

These three terms are elements of an entity-relationship diagram, which graphically represents the structure and design of the database that will hold this information.
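
These concepts map directly onto relational tables. The sketch below, using Python's built-in sqlite3 module, is a hypothetical illustration (the Staff/Branch names are invented): each entity becomes a table, each attribute a column, and the relationship a foreign key.

```python
import sqlite3

# Hypothetical mini-schema: Staff and Branch are entities, their columns are
# attributes, and the foreign key models a "Staff works at Branch" relationship.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.execute("CREATE TABLE Branch (branch_id INTEGER PRIMARY KEY, city TEXT)")
con.execute("""CREATE TABLE Staff (
    staff_id  INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,             -- attribute of the Staff entity
    branch_id INTEGER REFERENCES Branch  -- relationship: Staff -> Branch
)""")
con.execute("INSERT INTO Branch VALUES (1, 'London')")
con.execute("INSERT INTO Staff VALUES (10, 'Ann', 1)")

# The relationship lets us recombine the entities with a join.
row = con.execute("""SELECT Staff.name, Branch.city
                     FROM Staff JOIN Branch USING (branch_id)""").fetchone()
print(row)  # ('Ann', 'London')
```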

The database management system (DBMS) is software that allows the user to create, access, and control
databases.

Application programs communicate with the database through the DBMS using a data manipulation language (typically SQL).

The DBMS also provides a view mechanism, which allows each user to see only the information relevant to them, displayed in the manner they require.
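
A minimal sqlite3 sketch of the view mechanism (table and column names are hypothetical): the view exposes only the columns relevant to a given class of user.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Staff (staff_id INTEGER, name TEXT, salary REAL)")
con.execute("INSERT INTO Staff VALUES (1, 'Ann', 52000), (2, 'Bob', 48000)")

# The view hides the salary column from anyone who queries through it.
con.execute("CREATE VIEW StaffPublic AS SELECT staff_id, name FROM Staff")
rows = con.execute("SELECT * FROM StaffPublic ORDER BY staff_id").fetchall()
print(rows)  # [(1, 'Ann'), (2, 'Bob')]
```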

Five major components of a DBMS:

1. Hardware: The database, DBMS and application program must exist on hardware somewhere.
Ideally, the database and DBMS exist in a server in a secure location while the application
software is distributed to the client. This physical separation of client and server is known as a
client-server architecture for a database system.
2. Software: This encompasses the DBMS, the application programs, and the third- and fourth-generation languages used to improve DBMS management and application program generation.
3. Data: The data is the bridge between the people/procedures and the hardware/software. The
structure of a database is known as the schema. A database schema could consist of, say, four files (or tables), each possessing eight fields (or attributes).
4. Procedures: Instructions or rules that govern the design and use of the database.
5. People: Broken down into four types of people who interface with a database:
a. Data and database administrators: The data administrator manages the data resource: database planning, development and maintenance of standards, policies and procedures, and conceptual/logical database design; they also consult with senior management to ensure the database supports corporate objectives. The database administrator realizes the database design and is responsible for implementation, security and integrity control, maintenance of the operational system, and satisfactory user acceptance.
b. Database designer
c. Application developer
d. End-user

Database design must prioritize the data over the application program. Catering to the application
program first typically leads to structurally unsound database systems which lead to failure.

Logical database designers focus on identifying data, the relationships between the data and the
constraints applied to the data. Logical database designers work in two stages:

1. Conceptual design: models the data independently of the DBMS, application programs, programming languages, and any physical considerations (all independent of implementation).
2. Logical design: maps the conceptual design onto the specific data model of the target DBMS.

Physical database designers: realize the logical design, implementing it as tables and integrity constraints, selecting storage structures and access methods, and designing the performance and security measures. They must know the DBMS extremely well.

Two types of end user:

1. Naïve users: unaware of the DBMS; they rely on the application program to make the complicated decisions while they perform simple commands.
2. Sophisticated users: understand the DBMS and are capable of using high-level programming languages.

Relational DBMSs fall short in their limited ability to model the real world. Third-generation DBMSs such as object-oriented DBMSs (OODBMSs) and object-relational DBMSs (ORDBMSs) attempt to resolve this issue.

The 1990s saw the development of XML.

SAP provides an enterprise resource planning (ERP) system, an application layer built on top of a DBMS and designed to integrate all the business functions of an organization.

Controlled duplication of data within a DBMS may provide performance improvements.

DBMS may implement constraints on specific records, tables, or relationships between data to enforce
the integrity of the data stored in the DBMS.
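
A sketch of such constraints in sqlite3 (the Property/Branch schema is hypothetical): a CHECK constraint enforces a record-level rule, a FOREIGN KEY enforces a relationship rule, and the DBMS rejects rows that violate either.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # FK enforcement is opt-in in SQLite
con.execute("CREATE TABLE Branch (branch_id INTEGER PRIMARY KEY)")
con.execute("""CREATE TABLE Property (
    property_id INTEGER PRIMARY KEY,
    rent        REAL CHECK (rent > 0),            -- record-level rule
    branch_id   INTEGER NOT NULL REFERENCES Branch -- relationship rule
)""")
con.execute("INSERT INTO Branch VALUES (1)")
con.execute("INSERT INTO Property VALUES (100, 900.0, 1)")  # valid row

for bad in [(101, -5.0, 1),     # violates CHECK (rent > 0)
            (102, 700.0, 99)]:  # violates the foreign key
    try:
        con.execute("INSERT INTO Property VALUES (?, ?, ?)", bad)
    except sqlite3.IntegrityError as e:
        print("rejected:", e)
```

Only the valid row is stored; both bad inserts raise IntegrityError.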

Security is enforced using usernames and passwords as well as account credentials. The DML statements available to a user also limit the information they can observe and the actions they can perform.

2.0 DATABASE ENVIRONMENT

The starting point of the database design must be an abstract and general description of the information
requirements of the company to be represented by the database.

The three-level architecture represents the three ways data items can be described, with the purpose of separating the user’s view of the database from the actual representation of the database:

1. External level: The way the user perceives the data. Only includes the entities, attributes and
relationships in the “real world” that the user is interested in.
2. Internal level: The way the DBMS and operating system perceive the data. Describes how the
data is stored in the database. Focuses on how to optimally store and retrieve data. Concerned
with storage space allocation for data and indexes, record descriptions for storage, record
placement, data compression and data encryption techniques.
3. Conceptual level: Provides the mapping and the desired independence between the external
and internal levels. Describes what data is stored in the database and the relationships among
that data. Contains all entities, their attributes, and their relationships, the constraint on the
data, the semantic information about the data, and the security and integrity information. Must
not contain storage level details.

Each user should be able to access the same data in the database but view the data differently. Users
should also be able to customize their view of the data without affecting other users.

User interactions with the database should be independent of the way the DBMS physically organizes the data (such as indexing or hashing).

The administrator should be able to change the database storage structures without altering user views.
The administrator should be able to change the conceptual structure without changing the user views.

Internal structure should be unaffected by physical changes (such as changing to new storage devices)

There are three types of database schema in the three-level architecture:


1. External schema (subschema): Multiple external schemas describe the different external-level views of the data.
2. Conceptual schema: Describes all entities, attributes, and relationships together with
constraints. Only 1 per database.
3. Internal schema: Complete description of the internal model containing stored record
definitions, methods of representation, data fields, and indices of used storage structures.

The conceptual/internal mapping enables the DBMS to locate physical data based on the conceptual model of that data.

The external/conceptual mapping relates the terminology of the user to the concepts represented in the
database.

The database schema is the description of the database and not the database itself. The schema is the
intension of what the final database description should be. Database instances are snapshots (or states)
of what the data in the database is at any moment.

Logical data independence: The immunity of the external schema to change in response to changes in
the conceptual schema.

Physical data independence: The immunity of the conceptual or external schemas to changes in the
internal schema.
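
Both kinds of independence can be illustrated with a small sqlite3 sketch (table, view, and index names are hypothetical): an added index changes only the internal schema, and an added column changes only the conceptual schema; the user's existing view is unaffected by either.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Staff (staff_id INTEGER, name TEXT)")
con.execute("INSERT INTO Staff VALUES (1, 'Ann')")

# External schema: the view through which a user queries.
con.execute("CREATE VIEW StaffView AS SELECT staff_id, name FROM Staff")

# Physical change (internal schema): adding an index.
con.execute("CREATE INDEX idx_staff_name ON Staff(name)")
# Conceptual change: adding a column; the existing view is untouched.
con.execute("ALTER TABLE Staff ADD COLUMN salary REAL")

view_rows = con.execute("SELECT * FROM StaffView").fetchall()
print(view_rows)  # [(1, 'Ann')]
```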

Summary: How the information in the database is presented to the user should not be affected by additions or changes to the conceptual description of the real-life entities in the DBMS. That conceptual description, in turn, should not be affected by changes to the organization, security, or indexing of the data on the storage devices. These separations may slow the database system, but they allow each layer to be improved or changed without impacting the other layers.

A data model is a collection of concepts that are used to describe the organization’s data, the relationships between that data, and the constraints on that data. A data model comprises three parts:

1. A structural part listing a set of rules by which the database can be constructed
2. A manipulative part listing the operations allowed on the data (specifying what parts of the data
may be retrieved or manipulated)
3. A set of integrity constraints ensuring the accuracy of the data.

Applying this to the ANSI-SPARC three-level architecture yields:

1. An external data model, also called the Universe of Discourse (UoD), designed to represent each user’s view of the data
2. A conceptual data model to represent the logical view of the data independent of the DBMS
3. An internal data model to describe the conceptual schema in a way that the DBMS understands.

Object-based data models: A modeling system that uses concepts like entity, attribute, and relationship.
The entity is a distinct object in the organization that is to be represented in the database. An attribute
is a property of any given entity worth recording in the database. A relationship defines the association
between entities. Common object-based data models are:
 Entity-Relationship models
 Semantic models
 Functional models
 Object-oriented models

MS Access utilizes the relational record-based data model to express the data stored within the
database. Data and relationships are represented as tables with distinct field names.

Conceptual database design (or conceptual modeling) is the process of constructing a model of the
conceptual schema which is the complete and accurate representation of the data requirements of the
database system (enterprise) independent of the DBMS and independent of external application
programs. The conceptual model should be the heart and source of the database system since the
external model stems from the conceptual model and the internal model is the implementation of the
conceptual model with the chosen DBMS. Database design should start by establishing the conceptual
model.

Functions of a DBMS

A DBMS should provide the following ten functions and services:

1. Data storage, retrieval, and update.
2. A user-accessible catalog containing the descriptions of the stored data items. Typically a system catalog contains:
 Names, types and sizes of data items
 Names of relationships
 Integrity constraints on the data
 Names of authorized users who may access the data
 The data items accessible to each user and the type of accessibility allowed (e.g. insert,
update, delete, or read access)
 External, conceptual and internal schemas as well as the mapping between each
 Usage statistics such as frequency of transactions
3. Transaction support, ensuring that either all of a transaction’s changes are made or none of them are, preventing the database from being left in an inconsistent state.
4. Concurrency control services. These services prevent data loss during overlapping transactions
that utilize the same data/metadata.
5. Recovery services. The ability to recover the database to a consistent state after a failure.
6. Authorization services. Allowing only authorized users to access the database and limiting the viewable data based on each user’s authorization.
7. Support for data communication. Users must be able to access the data in the DBMS through communication messages. All messages are handled by a data communication manager (DCM), which communicates with the DBMS directly.
8. Integrity services. The DBMS must ensure that the data, and changes to the data, adhere to a set of rules that maintain data integrity (e.g. staff members may not be assigned to more than 100 properties).
9. Services to promote data independence. A DBMS must support the independence of programs
from the actual structure of the database.
10. Utility services. Services to help the DBA effectively use the DBMS such as:
 Import and export facilities
 Monitoring facilities
 Statistical analysis programs to examine performance and usage
 Index reorganization facilities to reorganize indexes and their overflow
 Garbage collection and reallocation to reallocate space on the physical storage devices
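
Functions 2 and 3 can be sketched together in sqlite3 (the Account table and the transfer rule are hypothetical): the system catalog is itself queryable, and a failed multi-step transaction is rolled back so the database never ends up half-updated.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Account (id INTEGER PRIMARY KEY, balance REAL)")
con.executemany("INSERT INTO Account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
con.commit()

# Function 2: the catalog (sqlite_master here) describes the stored data.
tables = con.execute("SELECT name FROM sqlite_master WHERE type='table'").fetchall()
print(tables)  # [('Account',)]

# Function 3: a transfer either fully happens or is fully undone.
try:
    con.execute("UPDATE Account SET balance = balance - 80 WHERE id = 1")
    con.execute("UPDATE Account SET balance = balance + 80 WHERE id = 99")  # no such row
    if con.execute("SELECT changes()").fetchone()[0] == 0:
        raise ValueError("credit side missing")
    con.commit()
except ValueError:
    con.rollback()  # the debit is undone; the database stays consistent

balance = con.execute("SELECT balance FROM Account WHERE id = 1").fetchone()[0]
print(balance)  # 100.0
```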
3.0 Database Architectures and the Web

File-server architecture consists of a system of user computers connected over a LAN to a server used as shared file storage. This file server contains files such as documents, spreadsheets, images, and databases, and holds the files necessary to run the applications and the DBMS.

File-server disadvantages:

1. Large network traffic
2. Full copy of DBMS is required on each computer
3. Concurrency, recovery, and integrity control are more complex since multiple DBMS instances are accessing the same files.

The traditional two-tier client-server architecture consists of the client (tier 1) and the database server (tier 2), all within a local area network (LAN). Data-intensive business applications handle the presentation of data (user interface) and the business and data application logic on the client side, and database access via transaction logic and data validation on the server side.

The client typically requests data by issuing an SQL command to the server via user interface actions to
the application. The server interprets the request by checking authorization, ensuring integrity,
maintaining the system catalog, and performing query and update processing all while providing
concurrency and recovery control. Here the server’s responsibility is only to query information from and maintain the database, reducing network traffic and refining the functional requirements of the server. This change also allows the client to take on more application-processing responsibility and less database-processing responsibility.

Enterprises with extremely large client bases pushed the two-tier system to its limit as applications became more complex. Complex applications demanded a larger quantity of high-specification client computers, increasing cost. This demand led to the three-tier client-server architecture.

The three-tier architecture separates the business logic and data processing layer of the application
from the client computer to an application server. The client computer would then host just the user
interface and simple computations for information display. The application server would be designed to
perform complex computations quickly and handles multiple clients at once. This application server
would then communicate with the database server containing the database and performing all the data
validations and database accessing functions.

An application server hosts an application programming interface (API) to expose business logic and
business processes for use by another application. Application servers must handle the following issues:

1. Concurrency
2. Network connection management
3. Providing access to all database servers
4. Database connection pooling
5. Legacy database support
6. Clustering support
7. Load balancing
8. Failover
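
Issue 4 (database connection pooling) can be sketched in a few lines of Python. This is a toy, hypothetical pool, not a production implementation: connections are opened once up front and handed out on demand, so each request avoids the cost of reconnecting.

```python
import queue
import sqlite3

class ConnectionPool:
    """Toy pool: hand out pre-opened connections instead of reconnecting per request."""
    def __init__(self, path, size=4):
        self._pool = queue.Queue()
        for _ in range(size):
            # A real application server would pool connections to a shared
            # database file or network DBMS; ":memory:" is just for illustration.
            self._pool.put(sqlite3.connect(path, check_same_thread=False))

    def acquire(self):
        return self._pool.get()   # blocks if all connections are in use

    def release(self, con):
        self._pool.put(con)       # return the connection for reuse

pool = ConnectionPool(":memory:", size=2)
con = pool.acquire()
result = con.execute("SELECT 1 + 1").fetchone()[0]
pool.release(con)
print(result)  # 2
```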

Typical application servers are the Java Platform, Enterprise Edition (JEE), Microsoft .NET, and Oracle Application Server.

A transaction processing (TP) monitor is a program that controls data transfer between clients and
servers in order to provide a consistent environment, particularly for online transaction processing
(OLTP). TP monitors assist in routing client requests to the appropriate DBMS. They can route requests to multiple heterogeneous applications, balance the load of many clients, and establish a connection to the DBMS only when a client requests one, preventing the slow-down caused by large numbers of continuously open but underutilized DBMS connections. In this way the TP monitor offloads work typically performed by the DBMS server. Examples are CICS (IBM), Tuxedo (Oracle), and the Java Transaction API (JTA).

A web service is a software system designed to support interoperable machine-to-machine interaction over a network. Web services share business logic, data, and processes through a programmatic interface across a network, designed for application-to-application interaction rather than user interaction.

The key to the web service approach is to use widely accepted standards:

 XML (eXtensible Markup Language)
 SOAP (Simple Object Access Protocol) is a communication protocol for exchanging structured
information over the internet. It is platform and language independent.
 WSDL (Web Services Description Language) is a protocol used to describe and locate a web service.
 UDDI (Universal Discovery, Description, and Integration) protocol is a platform-independent,
XML-based registry for businesses to list themselves on the internet. UDDI protocol is designed
to provide WSDL documents.

RESTful web services move away from SOAP-based services toward Representational State Transfer (REST) based communication. REST services do not require XML, SOAP, WSDL, or UDDI definitions. In the REST architectural style, data and functionality are considered resources and are accessed using Uniform Resource Identifiers (URIs), generally links on the web. Resources are acted upon using HTTP operations for create, read, update, and delete: POST, GET, PUT, and DELETE. The communication protocol is typically HTTP.
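
The CRUD-over-HTTP idea can be sketched with Python's standard library alone. This is a toy, hypothetical example (the `/staff/1` resource and its JSON body are invented): a minimal in-process server treats each URI as a resource, and a client creates it with PUT and reads it back with GET.

```python
import http.server
import json
import threading
import urllib.request

store = {}  # in-memory "resource store", keyed by URI path

class Handler(http.server.BaseHTTPRequestHandler):
    def _send(self, code, body=b""):
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):        # read the resource identified by the URI
        item = store.get(self.path)
        self._send(200, json.dumps(item).encode()) if item is not None else self._send(404)

    def do_PUT(self):        # create/update the resource at this URI
        length = int(self.headers["Content-Length"])
        store[self.path] = json.loads(self.rfile.read(length))
        self._send(200)

    def do_DELETE(self):     # remove the resource
        store.pop(self.path, None)
        self._send(204)

    def log_message(self, *args):  # silence per-request logging
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

# PUT creates the resource, GET reads it back.
req = urllib.request.Request(base + "/staff/1",
                             data=json.dumps({"name": "Ann"}).encode(), method="PUT")
urllib.request.urlopen(req)
with urllib.request.urlopen(base + "/staff/1") as resp:
    fetched = json.loads(resp.read())
print(fetched)  # {'name': 'Ann'}
server.shutdown()
```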

Service-oriented architecture separates the application from the services performed within it. The services are stateless units called by the application as needed to access the data repository. This allows the services to be combined in any way the application requires, accommodating any changes in business requirements that affect these services or the applications. This architecture works best with a standard programming language.
A distributed database management system (DDBMS) is a software system in which a single logical database is fragmented across the computers of a network. Each fragment is under the control of a separate DBMS that processes requests for that data and sends requests for data held in other fragments. The distribution of data is invisible to users, making the system appear centralized.
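
A toy sketch of the idea (not how a real DDBMS is built): two sqlite3 databases stand in for two network sites, customer rows are horizontally fragmented between them by key range, and a small router hides the split so callers see one logical table. All names and the split rule are hypothetical.

```python
import sqlite3

# Two "sites", each with its own fragment of the logical Customer table.
sites = [sqlite3.connect(":memory:"), sqlite3.connect(":memory:")]
for site in sites:
    site.execute("CREATE TABLE Customer (id INTEGER PRIMARY KEY, name TEXT)")

def site_for(customer_id):
    # Horizontal fragmentation by key range: ids below 1000 live at site 0.
    return sites[0] if customer_id < 1000 else sites[1]

def insert(customer_id, name):
    site_for(customer_id).execute(
        "INSERT INTO Customer VALUES (?, ?)", (customer_id, name))

def lookup(customer_id):
    # The caller never sees which site answered: distribution transparency.
    return site_for(customer_id).execute(
        "SELECT name FROM Customer WHERE id = ?", (customer_id,)).fetchone()

insert(7, "Ann")      # stored at site 0
insert(2500, "Bob")   # stored at site 1
a, b = lookup(7), lookup(2500)
print(a, b)  # ('Ann',) ('Bob',)
```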
