
Database Management System

Dr. G. Jasmine Beulah,


Dept. Computer Science
Kristu Jayanti College, Bengaluru
DATA
• Data is raw material that can be processed
by any computing machine
• Employee name, product name, name of a
student, marks of a student, any number,
image
Information
• It is the data that has been converted into
more useful or intelligible form.
• Eg: Report card of a student
Why we need information
• To gain knowledge about the surroundings
• To keep the system up to date
• To know the rules and regulations of the
society.
Differences-Data and Information

Data                                   | Information
Data is the raw fact                   | It is the processed form of data
It is not significant to business      | It is significant to business
Atomic-level piece of information      | It is a collection of data
Data does not help in decision making  | It helps in decision making
Eg: Product name, name of a student    | Eg: Report card sheet
Database
• Database is a collection of related information
• Known facts that can be recorded and have
implicit meaning.
• Eg: Dictionary, Telephone Directory
Several kinds of databases:
• Traditional database applications: Most of the
information that is stored and accessed is either textual
or numeric.
• Multimedia databases: They can now store pictures,
video clips, and sound messages.
• Geographic information systems (GIS): They can store and
analyze maps, weather data, and satellite images.
• Data warehouses and on-line analytical processing
(OLAP) systems are used in many companies to extract
and analyze useful information from very large
databases for decision making.
• Real-time and active database: This technology is used
in controlling industrial and manufacturing processes.
Implicit properties of a database:
• A database represents some aspect of the
real world.
• A database is a logically coherent collection of
data with some inherent meaning.
• A database is designed, built and populated with
data for a specific purpose.
Database management system
• DBMS is a collection of programs that enables
users to create and maintain the database.
• It is a general purpose software system that
facilitates the process of defining, constructing &
manipulating database for various applications.
• Defining: To specify the data types, structures &
constraints of data to be stored in the database.
• Constructing: To store the data itself on some
storage medium that is controlled by the DBMS.
• Manipulating: Querying the database to retrieve
specific data, updating the database to reflect
changes in the miniworld, and generating reports
from the data.
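The three activities above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the student table, its columns and the sample rows are assumptions made for the example and are not part of the slides.

```python
import sqlite3

# In-memory database, so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Defining: specify data types, structure and constraints.
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " marks INTEGER CHECK (marks BETWEEN 0 AND 100))"
)

# Constructing: store the data itself under the control of the DBMS.
conn.executemany(
    "INSERT INTO student (roll_no, name, marks) VALUES (?, ?, ?)",
    [(1, "Asha", 82), (2, "Ravi", 67), (3, "Meena", 91)],
)

# Manipulating: update the data, query it, and produce a small report.
conn.execute("UPDATE student SET marks = 70 WHERE roll_no = 2")
toppers = conn.execute(
    "SELECT name FROM student WHERE marks >= 80 ORDER BY name"
).fetchall()
average = conn.execute("SELECT AVG(marks) FROM student").fetchone()[0]
print(toppers, average)
```

The same three steps apply to any DBMS; only the connection call and some SQL details differ.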
Traditional File Oriented Approach
Advantages:
• No external storage required
• No highly technical person required
• Processing speed is high
Disadvantages of traditional approach
for storing data
• Data Security
 The data stored in the flat file(s) can be easily
accessible and hence it is not secure.
 Consider an online banking application where we
store the account related information of all
customers in flat files.
 A customer should have access only to his or her
own account details.
 However, with flat files it is difficult to enforce
such constraints.
• Data Redundancy and Inconsistency
In this storage model, the same information
may get duplicated in two or more files.
This may lead to higher storage and access
cost. It also may lead to data inconsistency.
Suppose the same data is repeated in two or
more files.
 If a change is made to the data stored in one file,
the other files also need to be changed
accordingly; otherwise the copies disagree.
• Difficulty in accessing data due to Data
Isolation
Data Isolation means that all the related data
is not available in one file.
Usually the data is scattered in various files
having different formats.
Hence writing new application programs to
retrieve the appropriate data is difficult.
• Program/Data Dependence
In traditional file approach, application
programs are closely dependent on the files in
which data is stored.
If we make any change to the physical format
of the file(s), such as adding a data field, all
application programs need to be changed
accordingly.
• Lack of Flexibility
Traditional systems can retrieve information
only for predetermined requests for
data.
 If we need unanticipated data, huge
programming effort is needed to make the
information available, provided the
information is there in the files.
By the time the information is made available,
it may no longer be required or useful
• Concurrent Access Anomalies
Many traditional systems allow multiple users
to access and update the same piece of data
simultaneously.
However, these concurrent updates may result
in inconsistent data.
To guard against this possibility, the system
must maintain some form of supervision.
But supervision is difficult because data may
be accessed by many different application
programs and these application programs may
not have been coordinated previously.
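A minimal sketch of such an anomaly (a "lost update") is shown below. The two programs and the shared balance are hypothetical, and their steps are interleaved by hand so that the outcome is deterministic.

```python
# A shared "file" holding one account balance.
balance_file = {"balance": 100}

def read_balance():
    return balance_file["balance"]

def write_balance(value):
    balance_file["balance"] = value

# Two programs each deposit money using read-modify-write,
# with their steps interleaved and no supervision.
a_seen = read_balance()       # program A reads 100
b_seen = read_balance()       # program B reads 100
write_balance(a_seen + 50)    # A writes 150
write_balance(b_seen + 30)    # B writes 130, overwriting A's deposit

lost_update_result = read_balance()
expected_result = 100 + 50 + 30  # what two correct deposits should give
print(lost_update_result, expected_result)
```

Program B's write is based on a stale read, so A's deposit disappears. Preventing this requires some form of coordination between the programs, which is what the supervision mentioned above refers to.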
• Integrity problems
The data values stored in the database must
satisfy certain types of consistency
constraints.
 Developers enforce the constraints in the
system by adding appropriate code in
application programs.
Therefore, adding a new constraint requires
changing the programs.
• Atomicity problem
Once a failure occurs and is detected, the data
has to be restored to the consistent state that
existed before the failure.
It is difficult to ensure this consistency in a
file processing system.
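For contrast, the sketch below shows how a DBMS transaction gives all-or-nothing behaviour, using Python's sqlite3 module. The account table, the transfer function and the simulated failure are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move money between two accounts as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            if fail_midway:
                raise RuntimeError("simulated failure between the two updates")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
    except RuntimeError:
        pass  # the partial debit has already been rolled back

def balances(conn):
    return conn.execute("SELECT balance FROM account ORDER BY id").fetchall()

transfer(conn, 1, 2, 200, fail_midway=True)
after_failure = balances(conn)   # unchanged: the debit was undone
transfer(conn, 1, 2, 200)
after_success = balances(conn)   # both updates applied together
print(after_failure, after_success)
```

In a file processing system the application itself would have to detect the half-finished transfer and undo it.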
Database Environment
Database users
Actors on the scene
The persons whose jobs involve the day to day use of the
large database are called actors on the scene.
• Database Administrators (DBA):
 The DBA is the chief administrator who oversees and manages
the resources (the primary resource is the database
itself and the secondary resource is the DBMS and related
software).
 The DBA is responsible for authorizing access to the database,
for coordinating and monitoring its use, and for acquiring
software and hardware resources as needed.
 The DBA is accountable for problems such as breaches of
security and poor response time.
• Database designers:
They are responsible for identifying data to be
stored in the database and for choosing
appropriate structures to represent and store
the data.
They have to study the requirements of the
various users to come up with the design that
meets the requirements.
This task is done before the database is
implemented and populated with data.
They also talk to the prospective users and
develop views for different users to meet
their processing requirements.
• End users:
 End users are the people whose jobs require
access to the database for querying, updating and
generating reports. There are several categories
of end users.
• Casual End users:
 They occasionally access the database but they
may need different information each time.
 They use a sophisticated database query
language to specify their requests and are typically
middle- or high-level managers or other occasional
browsers.
• Naïve or parametric End users:
Their job revolves around constantly querying
and updating the database using standard
types of queries & updates called canned
transactions.
Eg. Bank tellers, reservation clerks etc.
• Sophisticated End users:
They familiarize themselves with the facilities
of the DBMS, to implement their applications
to meet their complex requirements.
Eg. engineers, scientists etc.
• Stand alone users:
 They maintain personal databases by using
ready-made program packages that provide easy-to-use
menu-based or graphical interfaces.
• System analysts and Application programmers:
• System Analyst:
 They study the requirements of end users,
especially naïve and parametric end users, and write the
specifications for canned transactions.
• Application programmer: They implement these
specifications as programs, and then test, debug,
document and maintain the canned transactions.
Together, system analysts and application programmers
are called software engineers.
Workers behind the scene
• These persons are typically not interested in the
database itself.
• DBMS system designers & Implementers:
 These persons design and implement the DBMS
modules and interfaces as a software package.
 DBMS has several modules like recovery, security,
concurrency control etc.
 DBMS must interface with other system software,
such as operating system & compilers of
programming languages.
• Tool Developers:
These persons design & implement the tools
i.e. the software packages that facilitate
database system design & use and help
improve performance.
• Operators and maintenance personnel:
These are the administration personnel who
are responsible for the actual running and
maintenance of the hardware and software
environment for the database system.
Characteristics of the database
Approach
• Self describing nature of the Database
system:
The database system contains not only the
database itself but also a complete definition
or description of the database structure and
constraints. This definition is stored in the system
catalog.
The information stored in the catalog is called
metadata.
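As a concrete illustration, SQLite keeps its catalog in a table named sqlite_master, and the PRAGMA table_info command lists the columns of a table. The sketch below reads both. The employee table is a made-up example, and other DBMSs expose their catalogs under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL, salary REAL)"
)

# sqlite_master is SQLite's catalog table: it describes the database itself.
catalog_rows = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, default_value, pk).
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")
]
print(catalog_rows, columns)
```

Nothing was inserted into employee, yet the database can already describe its own structure. That description is the metadata.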
• Insulation between programs and data, data
Abstraction
 DBMS access programs do not need to change
when the structure of the data files changes, because the
structure of the data files is stored in the DBMS catalog,
separately from the access programs. This property is called
program-data independence.
 In object-oriented databases, users can define operations on
data as part of the database definitions. An operation has an
interface and an implementation. The implementation is
specified separately and can be changed without affecting
the interface. User programs can operate on the data by
invoking these operations through the interface regardless
of how the operations are implemented. This property is
called program operation independence.
 These characteristics of program data independence and
program operation independence are called data
abstraction.
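Program-data independence can be sketched as follows with sqlite3. The product table and the product_names function are hypothetical stand-ins for a stored file and an application program.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO product (id, name) VALUES (1, 'Pen')")

def product_names(conn):
    """An 'application program' that names only the columns it needs."""
    return [
        row[0] for row in conn.execute("SELECT name FROM product ORDER BY id")
    ]

before = product_names(conn)

# The structure changes: a new field is added to the stored records.
conn.execute("ALTER TABLE product ADD COLUMN price REAL")
conn.execute("INSERT INTO product (id, name, price) VALUES (2, 'Ink', 45.0)")

after = product_names(conn)  # same program, no code change needed
print(before, after)
```

In a file-based program that reads fixed-format records, adding the price field would normally force a change to the reading code.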
• Support of multiple views of data:
A database has many users and each of them
may require a different perspective or view of
database.
A view may be a subset of the database, or it may
contain virtual data that is derived from the
database files but is not explicitly stored.
A multiuser DBMS whose users have a variety of
applications must provide facilities for defining
multiple views.
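The sketch below defines two views over one table with sqlite3: one is a subset of the stored data, the other holds derived data. The table, the view names and the rows are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (1, "Asha", "Sales", 40000),
        (2, "Ravi", "HR", 35000),
        (3, "Meena", "Sales", 50000),
    ],
)

# A view for users who may see names and departments but not salaries.
conn.execute("CREATE VIEW staff_directory AS SELECT name, dept FROM employee")

# A view of derived (virtual) data: computed on demand, not stored.
conn.execute(
    "CREATE VIEW dept_payroll AS "
    "SELECT dept, SUM(salary) AS total FROM employee GROUP BY dept"
)

directory = conn.execute(
    "SELECT name, dept FROM staff_directory ORDER BY name"
).fetchall()
payroll = conn.execute(
    "SELECT dept, total FROM dept_payroll ORDER BY dept"
).fetchall()
print(directory, payroll)
```

Both views read from the same stored table, so a change to employee is reflected in each of them the next time they are queried.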
• Sharing of Data & Multiuser Transaction
Processing:
A multiuser DBMS must allow multiple users
to access the database at the same time.
This is essential if data for multiple
applications is to be integrated and
maintained in a single database.
The DBMS must include concurrency control
software to ensure that several users trying to
update the same data do so in a controlled
manner so that the result of the updates is
correct.
Implications of the database
approach
The implications of using the database approach that can
benefit most organizations are:
• Potential for Enforcing Standards
 The database approach permits the DBA to define and
enforce standards among database users in a large
organization.
 Standards can be defined for names and formats of data
elements, display formats, report structures and so on.
• Reduced Application Development Time
 A main advantage of the database approach is that
developing a new application takes very little time.
 Once a database is up and running, less time is required to
create new applications using DBMS facilities.
• Flexibility
 DBMS allows certain types of changes to the structure
of the database without affecting the stored data and
the existing application programs.
• Availability of up-to-date Information
 A DBMS makes the database available to all users.
When one user updates a database, all other users can
immediately see this update. This availability of up-to-
date information is essential for reservation systems,
banking databases etc.
• Economies of scale
 Wasteful overlap of resources and personnel can be
avoided by consolidating data and applications across
departments.
Advantages of using DBMS
• Controlling redundancy
 In DBMS approach the data required for different
users can be stored in a centralized manner
without duplicating the data multiple times.
 When the data is stored redundantly several
problems like the following arise,
 Duplication of effort, as the same data has to be
entered many times
 Wastage of storage space, as the same data is
stored many times
 Data may become inconsistent, as updates have to
be applied separately to every file.
• Restricting unauthorized access:
 When multiple users share the database it is required
that the users be authorized and given certain
operation permissions.
 This is taken care of by the authorization and security
subsystem, where the DBA creates accounts and
specifies the account restrictions.
• Providing persistent storage for program objects and
data structures:
 Database provides persistent storage for program
objects and data structures.
 This is one of the main reasons for the emergence of
object oriented database.
 The data structures provided by DBMS are compatible
with the programming language data structures
• Permitting inferencing & actions using rules:
Some data bases provide capabilities for
defining deduction rules for inferencing new
information from the stored database facts.
Such systems are called deductive database
systems.
When the miniworld rules change, it is easier to
change the declared deduction rules than to
recode procedural programs.
Active database systems provide active
rules that can automatically initiate actions
when certain events and conditions occur.
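One common form of active rule is the SQL trigger. The sketch below uses an SQLite trigger as an assumed, simplified stand-in for such a rule. The stock and reorder_log tables and the threshold of 10 are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER NOT NULL);
CREATE TABLE reorder_log (item TEXT, qty_at_event INTEGER);

-- Active rule: event = update of qty, condition = qty below 10,
-- action = record a reorder request.
CREATE TRIGGER low_stock AFTER UPDATE OF qty ON stock
WHEN NEW.qty < 10
BEGIN
    INSERT INTO reorder_log VALUES (NEW.item, NEW.qty);
END;

INSERT INTO stock VALUES ('pen', 50);
""")

conn.execute("UPDATE stock SET qty = 30 WHERE item = 'pen'")  # condition false
conn.execute("UPDATE stock SET qty = 4 WHERE item = 'pen'")   # condition true

reorders = conn.execute(
    "SELECT item, qty_at_event FROM reorder_log"
).fetchall()
print(reorders)
```

The application only updates the quantity; the DBMS itself initiates the logging action when the condition holds.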
• Representing complex relationships among data:
 DBMS must have the capability to represent a
variety of complex relationships among the data
as well as to retrieve and update related data
easily & efficiently.
• Providing multiple user interfaces:
 Since people with varying levels of technical
knowledge use the database, the DBMS should
provide a variety of user interfaces.
 These include query languages for casual users and
menu-driven and natural language interfaces for stand-alone
users.
• Enforcing integrity constraints:
DBMS must provide capabilities for defining
and enforcing the constraints (rules).
Constraints include specifying the data type,
uniqueness of data item etc.
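A minimal sketch of constraints declared once in the database, instead of in every application program, is given below using sqlite3. The student table, the try_insert helper and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " email TEXT NOT NULL UNIQUE,"
    " marks INTEGER CHECK (marks BETWEEN 0 AND 100))"
)
conn.execute("INSERT INTO student VALUES (1, 'asha@example.com', 82)")

def try_insert(row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert((2, "ravi@example.com", 67)),    # valid row
    try_insert((3, "asha@example.com", 75)),    # duplicate email violates UNIQUE
    try_insert((4, "meena@example.com", 140)),  # marks out of range violates CHECK
]
print(results)
```

Every program that writes to the table is subject to the same rules, so adding a new constraint does not require editing each program.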
• Providing backup & recovery:
DBMS must provide facilities for recovering
from hardware & software failures.
The backup and recovery subsystem is
responsible for recovery.
Disadvantages of DBMS
• Complexity :
 The provision of the functionality that is expected
of a good DBMS makes the DBMS an extremely
complex piece of software.
 Database designers, developers, database
administrators and end-users must understand
this functionality to take full advantage of it.
 Failure to understand the system can lead to bad
design decisions, which can have serious
consequences for an organization.
• Size :
 The complexity and breadth of functionality
makes the DBMS an extremely large piece of
software, occupying many megabytes of disk
space and requiring substantial amounts of
memory to run efficiently.
• Performance:
 A File Based system is written for a specific
application and as a result, performance is
generally very good.
 However, the DBMS is written to be more
general, to cater for many applications rather
than just one.
 The effect is that some applications may not run
as fast as they used to.
• Higher impact of a failure:
The centralization of resources increases the
vulnerability of the system.
Since all users and applications rely on the
availability of the DBMS, the failure of any
component can bring operations to a halt.
• Cost of DBMS:
The cost of DBMS varies significantly,
depending on the environment and
functionality provided.
• Additional hardware costs:
 The disk storage requirements for the DBMS and the
database may necessitate the purchase of additional
storage space.
 Furthermore, to achieve the required performance it
may be necessary to purchase a larger machine,
perhaps even a machine dedicated to running the
DBMS.
 The procurement of additional hardware results in
further expenditure.
• Cost of Conversion:
 The cost of converting existing applications to run on
the new DBMS and hardware can be high.
 This cost also includes the cost of training staff to use
these new systems and possibly the employment of
specialist staff to help with conversion and running of
the system
When not to use a DBMS
• In spite of the advantages of using a DBMS, there are a few
situations in which it incurs unnecessary overhead costs, such as:
 High initial investment in hardware, software and training.
 Overhead for providing generality, security, concurrency
control, recovery, and integrity functions.
• A DBMS may be unnecessary if,
 The database and applications are simple, well defined and
not expected to change.
 There are stringent real-time requirements that may not be
met because of DBMS overhead.
 Access to data by multiple users is not required.
• When a DBMS may not suffice:
 If the database system is not able to handle the complexity of
the data because of modeling limitations.
 If the database users need operations not supported by
the DBMS.
Questions:
1. Define the terms:
• i. Database
• ii. DBMS
• iii. Meta data
• iv. System catalog
• v. Data
• vi. Information
2. Explain the characteristics of database approach.
3. Explain the advantages and disadvantages of traditional file oriented
approach.
4. Explain the advantages and disadvantages of database approach.
5. Expand DBA. What is the role of DBA?
6. Explain the implications of Database approach.
7. What are the circumstances when DBMS should not be used?
8. Write short notes on the users of the Database.
DATABASE SYSTEM CONCEPTS AND
ARCHITECTURE
• Data models
 A DBMS allows a user to specify the data to be stored in
terms of a data model.
 A data model is a collection of high-level data
description constructs that hide low-level storage
details.
 It is the collection of concepts that can be used to
describe the structure of a database. (Structure means
the data types, relationships and constraints that hold
on the data.)
• Data models are of three types:
1. Object based / High-level data models
2. Record based / Representational data models
3. Physical data model / low-level data models
• High Level or Conceptual Data Model:
 It provides the concepts that are close to the way many
users perceive the data. Conceptual Data Model uses the
concepts like entity, attributes, relationship.
 Entity: It is the real world object.
 Attribute: property of the entity.
 Relationship: Interaction between entities.
• Representational or Implementation Data Model:
 It provides concepts that end users can understand and that are
not far from the way data is organized in the computer. It
includes the relational, hierarchical and network
models.
 There are 3 types of record based data models.
 They are:
i. Hierarchical data model.
ii. Network data model.
iii. Relational data model.
i. Hierarchical data model
In this model each entity has only one parent
but can have several children.
At the top of the hierarchy there is only one entity,
which is called the root.

The data is organized hierarchically, using a downward tree. This model
uses pointers to navigate between stored data. It was the first DBMS
model.
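The parent and child pointers described above can be sketched with a small tree structure in Python. The Node class and the sample entities are illustrative assumptions, not the storage format of any real hierarchical DBMS.

```python
class Node:
    """One entity in the hierarchy: at most one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # pointer to the single parent (None for the root)
        self.children = []        # pointers to the child entities

    def add_child(self, name):
        child = Node(name, parent=self)
        self.children.append(child)
        return child

def path_from_root(node):
    """Follow parent pointers upward, then reverse to get the root-to-node path."""
    path = []
    while node is not None:
        path.append(node.name)
        node = node.parent
    return path[::-1]

root = Node("College")
dept = root.add_child("Computer Science")
course = dept.add_child("DBMS")
root.add_child("Commerce")

course_path = path_from_root(course)
print(course_path)
```

Because every entity has exactly one parent, there is exactly one path from the root to any entity.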
ii. Network data model
In the network model, entities are organised in
a graph, in which some entities can be accessed
through several paths.

Like the hierarchical model, this model uses pointers toward
stored data. However, it does not necessarily use a downward
tree structure.
iii. Relational data model (RDBMS, Relational
database management system)
In this model, data is organised in two-dimensional
tables called relations. The tables,
or relations, are related to each other.

The data is stored in two-dimensional tables
(rows and columns). The data is manipulated
based on the mathematical theory of relations.
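As a small illustration of related tables, the sketch below stores two relations in SQLite and combines them with a join on a shared column. The department and student tables and their rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE student (
    roll_no INTEGER PRIMARY KEY,
    name TEXT,
    dept_id INTEGER REFERENCES department(dept_id)
);
INSERT INTO department VALUES (10, 'Computer Science'), (20, 'Commerce');
INSERT INTO student VALUES (1, 'Asha', 10), (2, 'Ravi', 20), (3, 'Meena', 10);
""")

# The two relations are linked by matching dept_id values, not by pointers.
cs_students = conn.execute(
    "SELECT s.name FROM student AS s "
    "JOIN department AS d ON s.dept_id = d.dept_id "
    "WHERE d.dept_name = 'Computer Science' ORDER BY s.name"
).fetchall()
print(cs_students)
```

Unlike the hierarchical and network models, the relationship here is expressed through data values, and the query does not say how to navigate the stored records.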
• Low Level or Physical Data Model:
It provides the concepts that describe the
details of how data is stored in the computer.
The physical data model describes the storage of
data in the computer by representing
information such as record formats, record
orderings and access paths.
Concepts provided by these models are
generally meant for computer specialists, not
for typical end users.
