
Data Base System


Course Outline:

1. Databases Overview: Basic Concept; File Processing & Database Approach, Database Applications,
   Advantages of the DB, Components of the DB Environment, and Evolution of DBs.
2. Database Architecture: DB Development Process, Three Schema Architecture, Data Modeling, E-R Modeling
   (Basic Concepts)
3. Logical Design: E-R Modeling (Entities, Attributes, Relationships; Cardinality Constraints), RDBMS: Logical
   View of Data; The Relational Data Model
4. Logical Design: Constraints, Transforming ERD/EERD into Relations
5. The Relational Model: Types, Relations, Relational Algebra, Relational Calculus, Integrity
6. Normalization: First Normal Form, Second Normal Form
7. Normalization: Third Normal Form (3NF), Boyce Codd Normal Form (BCNF)
8. EE-R Diagrams: Development & Constraints, DB Design Life Cycle
9. DB Development & Management: Introduction to SQL and Basic Commands, SQL Integrity Constraints.
10. Physical DB Design, DB architecture, Query Optimization
11. SQL Commands: Saving, Listing, Editing, Restoring Table Contents; Logical Operators, Management
    Commands
12. Arithmetic Operators, Complex Queries and SQL Functions, Aggregate Function, Grouping Functions
13. Virtual Tables, Views, Indexes, Joins
14. Client-Server & Distributed Environment, ODBC, Bridges, and Connectivity Issues.
15. Concurrency Control with Locking, Serializability, Deadlocks, Database Recovery Management.
16. Distributed Processing and Distributed Databases, DDBMS: Evolution, Architecture, Components,
    Advantages, Security and Authorization. Physical Design: Storage and File Structure, Efficiency And Tuning

Msc IT (3rd Term) ILM College Jauharabad M.Wasim



Database Overview and Basic Concepts:

What is Data?
Data is a collection of raw, unprocessed facts and statistics, either stored or flowing freely over a network.

Data becomes information when it is processed, turning it into something meaningful.

Database:

Database is a collection of related data and data is a collection of facts and figures that can be processed to
produce information.

A Database is a collection of related data organized in a way that data can be easily accessed, managed and
updated.

A database can be software based or hardware based, with one sole purpose: storing data.
In the early days of computing, data was collected and stored on tapes, which allowed only sequential access:
to read one record, the tape had to be wound past everything stored before it. Tapes were slow and bulky, and
computer scientists soon realized that they needed a better solution to this problem.

What is DBMS?
A DBMS is software that allows the creation, definition and manipulation of a database, allowing users to store,
process and analyze data easily.
A DBMS provides an interface or tool for performing operations such as creating a database, creating tables in
it, storing data, updating data, and a lot more.

A DBMS also provides protection and security for its databases, and it maintains data consistency when multiple
users access the data.

Database Applications:
Here are some examples of popular DBMSs in use today:

• MySQL
• Oracle
• SQL Server
• IBM DB2
• PostgreSQL
• Amazon SimpleDB (cloud based) etc.


Characteristics:
Traditionally, data was organized in file formats. DBMS was a new concept then, and research focused on
overcoming the deficiencies of the traditional style of data management. A modern DBMS has the following
characteristics –

1. Real-world entity
2. Relation-based tables
3. Isolation of data and application
4. Less redundancy
5. Consistency
6. Query Language
7. ACID Properties
8. Multiuser and Concurrent Access
9. Multiple views
10. Security

File Processing System vs Database Approach


File Processing System
In the past, many organizations used file processing systems exclusively to store and manage
data. In a typical file processing system, each department or area within an organization has its own set
of files. The records in one file may not relate to the records in any other
file. Organizations have used file processing systems for many years. Many of these systems, however,
have two major weaknesses: they hold redundant data and they isolate data.

Database Approach
When an organization uses the database approach, many programs and users share the data in
the database. A school's database most likely contains, at a minimum, data about students,
instructors, the schedule of classes, and student schedules. Various areas within the school
share and interact with the data in this database. The database does
secure its data, however, so that only authorized users can access certain data items. While a user
is working with the database, the DBMS resides in the memory of the computer.

Advantages of DBMS
• Separation of data from application programs.
• Minimal data duplication (redundancy).
• Easy retrieval of data using a query language.
• Reduced development time and maintenance effort.
• With cloud data centers, database management systems can now store very large volumes of data.
• Seamless integration with application programming languages, which makes it much easier to add a database
  to almost any application or website.


Components of the DB Environment:


The database management system can be divided into five major components:

1. Hardware
2. Software
3. Data
4. Procedures
5. Database Access Language

Evolution of DB.

Database Architecture:
A database management system is not always directly available for users and applications to access and store data
in it. Depending on its architecture, a database management system can be centralized (all the data stored at one
location), decentralized (multiple copies of the database at different locations), or hierarchical.


1-tier DBMS architecture also exists: this is when the database is directly available to the user for storing
data. Generally such a setup is used for local application development, where programmers communicate directly
with the database for a quick response.

Database Architecture is logically of two types:

1. 2-tier DBMS architecture


2. 3-tier DBMS architecture

2-tier DBMS Architecture


2-tier DBMS architecture includes an Application layer between the user and the DBMS, which is responsible
for communicating the user's request to the database management system and then sending the response from
the DBMS back to the user.

An application interface known as ODBC (Open Database Connectivity) provides an API that allows a client-side
program to call the DBMS. Such an architecture gives the DBMS extra security, as it is not exposed to the end
user directly.
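
As a rough sketch of this idea, the Python snippet below plays the role of the application layer: it takes a
user's request, passes it to the DBMS through a database API, and returns the response. It uses Python's built-in
sqlite3 module as a stand-in for an ODBC connection (an assumption made for illustration; SQLite is an embedded
library, not a separate DBMS server), and the student table and handle_request function are hypothetical.

```python
import sqlite3

# Stand-in for the DBMS: an in-memory SQLite database (assumption; a real
# 2-tier setup would connect to a separate DBMS server, e.g. through ODBC).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(1, "Abdullah"), (2, "Wasim")])


def handle_request(student_id):
    """Application layer: forward the request to the DBMS, return its response."""
    row = conn.execute(
        "SELECT name FROM student WHERE id = ?", (student_id,)
    ).fetchone()
    return row[0] if row else None


print(handle_request(2))
```

The user only ever calls handle_request; the SQL and the connection stay hidden behind the application layer.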

3-tier DBMS Architecture


3-tier DBMS architecture is the most commonly used architecture for web applications. In 3-tier architecture,
an additional Presentation or GUI layer is added, which provides a graphical user interface for the end user
to interact with the DBMS.
For the end user, the GUI layer is the database system; the end user has no knowledge of the application layer
or the DBMS behind it.
If you have used MySQL, you have probably seen phpMyAdmin, which is a common example of a 3-tier DBMS
architecture.

Three Schema Architecture


A database schema defines its entities and the relationships among them. It is the database designers who design
the schema, to help programmers understand the database and make it useful.

Databases are characterized by a three-schema architecture because there are three different ways to look at
them. Each schema is important to different groups in an organization.


1) User view,
2) logical schema,
3) physical schema.

1) User views
User views specify which users are permitted access to what data in a database.
For example, an employee database might contain employee names, addresses, and phone numbers.

2) Logical Schema
A database's logical schema is its overall logical plan. This schema is developed with diagrams that define the
content of database tables and describe how the tables are linked together for data access.

3) Physical Schema
The physical schema of a database refers to how data is stored on the computer on which it resides.
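
The three levels can be sketched with Python's built-in sqlite3 module. In this hedged example the table
definition stands for the logical schema, a SQL view stands for a user view, and the physical schema is whatever
storage SQLite chooses (here, memory). The employee table, its salary column and the employee_directory view are
hypothetical, and SQLite has no per-user permissions, so the view only illustrates which columns a group of users
would see.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # physical storage is left to SQLite

# Logical schema: the full employee table (hypothetical columns).
conn.execute(
    "CREATE TABLE employee (name TEXT, address TEXT, phone TEXT, salary REAL)"
)
conn.execute("INSERT INTO employee VALUES ('Abdul', 'Jauharabad', '53337', 50000)")

# User view: exposes only the name and phone columns.
conn.execute("CREATE VIEW employee_directory AS SELECT name, phone FROM employee")

directory = conn.execute("SELECT * FROM employee_directory").fetchall()
print(directory)
```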

Data Modeling:
Data models define how the logical structure of a database is modeled. Data models are the fundamental means
of introducing abstraction in a DBMS. They define how data items are connected to each other and how they
are processed and stored inside the system.
The Relational Model is the most widely used database model, but there are other models too:

• Hierarchical Model
• Network Model
• Entity-relationship Model
• Relational Model

Hierarchical Model
This database model organizes data into a tree-like-structure, with a single root, to which all the other data is
linked. The hierarchy starts from the Root data, and expands like a tree, adding child nodes to the parent nodes.
In this model, a child node will only have a single parent node.
This model efficiently describes many real-world relationships, like the index of a book, recipes, etc.
In the hierarchical model, data is organized into a tree-like structure with a one-to-many relationship between
two different types of data. For example, one department can have many courses, many professors and, of course,
many students.


Network Model
This is an extension of the Hierarchical model. In this model data is organized more like a graph, and a child
node is allowed to have more than one parent node.
Because more relationships are established in this model, the data is more closely related, which makes
accessing the data easier and faster. This database model was used to map many-to-many data relationships.
This was the most widely used database model before the Relational Model was introduced.

Entity-Relationship Model:
Entity-Relationship (ER) Model is based on the notion of real-world entities and relationships among them.
While formulating a real-world scenario into the database model, the ER Model creates entity sets, relationship
sets, general attributes, and constraints.
In this database model, each object of interest is modeled as an entity and its characteristics as attributes.
Different entities are related using relationships.

ER Model is best used for the conceptual design of a database.

ER Model is based on −

• Entities and their attributes.

• Relationships among entities.

• Entity − An entity in an ER Model is a real-world entity having properties called attributes. Every attribute
  is defined by its set of values, called its domain. For example, in a school database, a student is considered
  an entity. A student has various attributes like name, age, class, etc.


• Relationship − The logical association among entities is called a relationship. Relationships are mapped with
  entities in various ways. Mapping cardinalities define the number of associations between two entities.

Mapping cardinalities −

o one to one

o one to many

o many to one

o many to many

Relational Model:
The most popular data model in DBMS is the Relational Model. It is a more scientific model than the others.
In this model, data is organized in two-dimensional tables, and relationships are maintained by storing a
common field. The basic structure of data in the relational model is the table. All the information related
to a particular type is stored in the rows of that table. Hence, tables are also known as relations in the
relational model.

ER MODEL – BASIC CONCEPTS:

The ER model defines the conceptual view of a database. It works around real-world entities and the
associations among them. At view level, the ER model is considered a good option for designing databases.

Entity:
An entity is a real-world object that can be easily identified. For example, in a school database, students,
teachers, classes, and courses offered can be considered as entities. All these entities have some attributes or
properties that give them their identity.
An entity set is a collection of similar types of entities. An entity set may contain entities with attributes
sharing similar values. For example, a Students set may contain all the students of a school; likewise a Teachers
set may contain all the teachers of a school from all faculties. Entity sets need not be disjoint.

Attributes:
Entities are represented by means of their properties called attributes. All attributes have values. For example,
a student entity may have name, class, and age as attributes.
There exists a domain or range of values that can be assigned to attributes. For example, a student's name
cannot be a numeric value. It has to be alphabetic. A student's age cannot be negative, etc.
Types of Attributes:
Simple attribute:
Simple attributes are atomic values, which cannot be divided further. For example, a
student's phone number is an atomic value of 10 digits.
Composite attribute:
Composite attributes are made of more than one simple attribute. For example, a
student's complete name may have a first name and a last name.
Derived attribute:
Derived attributes are attributes that do not exist in the physical database; their
values are derived from other attributes present in the database. For example, the average
salary in a department should not be saved directly in the database; instead, it can be
derived. As another example, age can be derived from date_of_birth.

Single-value attribute:
Single-value attributes contain a single value.
For example: Social_Security_Number.

Multi-value attribute:
Multi-value attributes may contain more than one value. For example, a person can have
more than one phone number, email_address, etc.
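
The attribute types above can be sketched as a small Python class. This is only an illustration: the Student
class, its field names and the sample values are hypothetical, and the derived attribute is shown as a method
that computes age from date_of_birth instead of storing it.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Student:
    roll_number: int       # simple, single-value attribute
    first_name: str        # the composite attribute "name" is split
    last_name: str         # into two simple attributes
    date_of_birth: date    # stored attribute
    phone_numbers: list = field(default_factory=list)  # multi-value attribute

    def age_on(self, today):
        """Derived attribute: computed from date_of_birth, never stored."""
        years = today.year - self.date_of_birth.year
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month, self.date_of_birth.day)
        return years if had_birthday else years - 1


# Hypothetical sample data.
s = Student(401, "Abdullah", "Khan", date(2000, 5, 20), ["111-1111", "222-2222"])
print(s.age_on(date(2024, 5, 20)))
```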

Entity-Set and Keys:


A key is an attribute or collection of attributes that uniquely identifies an entity within an entity set. For
example, the roll_number of a student makes him/her identifiable among students.
Super Key:
A set of attributes (one or more) that collectively identifies an entity in an entity set.
Candidate Key:
A minimal super key is called a candidate key. An entity set may have more than one candidate key.
Primary Key:
A primary key is one of the candidate keys, chosen by the database designer to uniquely identify entities in
the entity set.
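
The difference between a super key and a candidate key can be sketched in Python. The functions and the sample
rows below are hypothetical. Note the assumption: a key is really a property of the schema (it must hold for
every possible set of entities), so testing it against one sample, as done here, only illustrates the
definitions.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows, attributes):
    """Minimal super keys, found by trying attribute sets from small to large."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # Skip any set that already contains a smaller key: it is not minimal.
            minimal = not any(set(k) <= set(attrs) for k in keys)
            if minimal and is_super_key(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical sample entity set.
students = [
    {"roll_number": 401, "email": "a@example.com", "name": "Abdul"},
    {"roll_number": 402, "email": "b@example.com", "name": "Abdul"},
]
print(candidate_keys(students, ["roll_number", "email", "name"]))
```

Here (roll_number, name) is a super key but not a candidate key, because roll_number alone already identifies
each student. The designer would pick one of the candidate keys as the primary key.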
Relationship:
The association among entities is called a relationship.
For example, an employee works_at a department, and a student enrolls in a course. Here, Works_at and Enrolls
are called relationships.

Relationship Set :
A set of relationships of similar type is called a relationship set. Like entities, a relationship too can have
attributes. These attributes are called descriptive attributes.
Degree of Relationship:
The number of participating entities in a relationship defines the degree of the relationship.

Mapping Cardinalities:

Cardinality defines the number of entities in one entity set that can be associated with entities of the
other set via a relationship set.
One-to-one:
One entity from entity set A can be associated with at most one entity of entity set B and vice versa.

One-to-many:
One entity from entity set A can be associated with more than one entity of entity set B; however, an entity
from entity set B can be associated with at most one entity of entity set A.

Many-to-one:
More than one entity from entity set A can be associated with at most one entity of entity set B; however, an
entity from entity set B can be associated with more than one entity from entity set A.

Many-to-many:
One entity from A can be associated with more than one entity from B and vice versa.
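
The four cases can be told apart by counting partners on each side of a relationship set. The following Python
sketch does this for a list of (a, b) pairs; the function name and the sample pairs are hypothetical. A mapping
cardinality is a design constraint, so this only classifies what a given sample of data happens to show.

```python
def mapping_cardinality(pairs):
    """Classify a relationship set given as (a, b) pairs, following the definitions above."""
    a_to_b, b_to_a = {}, {}
    for a, b in pairs:
        a_to_b.setdefault(a, set()).add(b)
        b_to_a.setdefault(b, set()).add(a)
    # Does some entity of A have several partners in B, and the other way round?
    a_many = any(len(partners) > 1 for partners in a_to_b.values())
    b_many = any(len(partners) > 1 for partners in b_to_a.values())
    if a_many and b_many:
        return "many-to-many"
    if a_many:
        return "one-to-many"
    if b_many:
        return "many-to-one"
    return "one-to-one"


# One department (A) has two students (B); each student has one department.
print(mapping_cardinality([("CSE", "Abdul"), ("CSE", "Wasim")]))
```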

Basic Relational DBMS Concepts


A Relational Database Management System (RDBMS) is a database management system based on the relational
model introduced by E. F. Codd. In the relational model, data is stored in relations (tables) and is represented
in the form of tuples (rows).

Concepts
Tables − In the relational data model, relations are saved in the format of tables. This format stores the
relation among entities. A table has rows and columns, where rows represent records and columns represent the
attributes.

Tuple − A single row of a table, which contains a single record for that relation is called a tuple.

Relation instance − A finite set of tuples in the relational database system represents relation instance.
Relation instances do not have duplicate tuples.

Relation schema − A relation schema describes the relation name (table name), attributes, and their names.

Relation key − Each row has one or more attributes, known as relation key, which can identify the row in the
relation (table) uniquely.

Attribute domain − Every attribute has some pre-defined value scope, known as attribute domain.

Constraints
Every relation has some conditions that must hold for it to be a valid relation. These conditions are
called Relational Integrity Constraints. There are three main integrity constraints −

• Key constraints

• Domain constraints

• Referential integrity constraints


Key Constraints
There must be at least one minimal subset of attributes in the relation that can identify a tuple uniquely.
This minimal subset of attributes is called the key for that relation. If there is more than one such minimal
subset, these are called candidate keys.

Key constraints require that −

• In a relation with a key attribute, no two tuples can have identical values for key attributes.

• A key attribute cannot have NULL values.

Key constraints are also referred to as Entity Constraints.

Domain Constraints
Attributes have specific values in real-world scenarios. For example, age can only be a positive integer. The
same constraints are applied to the attributes of a relation. Every attribute is bound to have a
specific range of values. For example, age cannot be less than zero and telephone numbers cannot contain a
digit outside 0-9.

Referential integrity Constraints


Referential integrity constraints work on the concept of foreign keys. A foreign key is a key attribute of a
relation that can be referred to in another relation.

The referential integrity constraint states that if a relation refers to a key attribute of a different or the
same relation, then that key element must exist.
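
The three kinds of constraint can be tried out with Python's built-in sqlite3 module. This is a minimal sketch
under stated assumptions: the department and student tables, their columns and the violates helper are
hypothetical, and SQLite is used only because it ships with Python. Each rejected INSERT corresponds to one
constraint type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,                     -- key constraint
        age     INTEGER CHECK (age >= 0),                -- domain constraint
        dept_id INTEGER REFERENCES department (dept_id)  -- referential integrity
    )""")
conn.execute("INSERT INTO department VALUES (1, 'CSE')")
conn.execute("INSERT INTO student VALUES (401, 19, 1)")


def violates(sql):
    """Return True if the DBMS rejects the statement with an integrity error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


results = {
    "duplicate key": violates("INSERT INTO student VALUES (401, 20, 1)"),
    "negative age": violates("INSERT INTO student VALUES (402, -5, 1)"),
    "unknown department": violates("INSERT INTO student VALUES (403, 18, 99)"),
    "valid row": violates("INSERT INTO student VALUES (404, 18, 1)"),
}
print(results)
```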

What is Relational Algebra?


Every database management system must define a query language to allow users to access the data stored in the
database. Relational Algebra is a procedural query language used to query the database tables to access data in
different ways.
In relational algebra, input is a relation (table from which data has to be accessed) and output is also a relation (a
temporary table holding the data asked for by the user).

The primary operations that we can perform using relational algebra are:

1. Select
2. Project
3. Union
4. Set Difference
5. Cartesian product
6. Rename

Select Operation (σ)


This is used to fetch the rows (tuples) from a table (relation) that satisfy a given condition.

Syntax: σp(r)


where σ represents the Select operation, r is the name of the relation (the table in which you want to look for
data), and p is the selection predicate, written in propositional logic, which specifies the conditions that
must be satisfied by the data. In the predicate, one can use comparison operators like =, <, > etc. to specify
the conditions.
As an example, take a Student table that has an age attribute, and fetch the data for students older than 17.

σ age > 17 (Student)
This will fetch the tuples (rows) from the table Student for which age is greater than 17.
You can also use operators such as and, or etc. to combine two conditions, for example,

σ age > 17 and gender = 'Male' (Student)


This will return tuples (rows) from the table Student with information on male students older than 17. (This
assumes the Student table has an attribute gender too.)
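
The same two selections can be mimicked in Python, treating a relation as a list of dictionaries and the
predicate p as a function. The select helper and the sample Student rows are hypothetical and serve only to
show that the output of a selection is again a relation.

```python
def select(relation, predicate):
    """σ_predicate(relation): keep the tuples for which the predicate is true."""
    return [row for row in relation if predicate(row)]


# Hypothetical Student relation.
student = [
    {"name": "Abdullah", "age": 18, "gender": "Male"},
    {"name": "Ayesha", "age": 19, "gender": "Female"},
    {"name": "Malik", "age": 17, "gender": "Male"},
]

older = select(student, lambda t: t["age"] > 17)
older_males = select(student, lambda t: t["age"] > 17 and t["gender"] == "Male")
print([t["name"] for t in older], [t["name"] for t in older_males])
```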

Project Operation (∏)


The Project operation is used to project only a certain set of attributes of a relation. In simple words, if you
want to see only the names of all the students in the Student table, then you can use the Project operation.
It shows only the columns or attributes asked for, and it also removes duplicate rows from the result.

Syntax: ∏A1, A2...(r)


where A1, A2 etc. are attribute names (column names).
For example,

∏Name, Age(Student)
The above statement will show us only the Name and Age columns for all the rows of data in the Student table.
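
A small Python sketch of projection follows, again treating a relation as a list of dictionaries. The project
helper and the sample rows are hypothetical; the point is that only the requested columns survive and that rows
which become identical after projection are kept once.

```python
def project(relation, attributes):
    """∏_attributes(relation): keep only the named columns and drop duplicate rows."""
    result = []
    for row in relation:
        projected = {a: row[a] for a in attributes}
        if projected not in result:   # duplicate elimination
            result.append(projected)
    return result


# Hypothetical Student relation.
student = [
    {"name": "Abdul", "age": 18, "branch": "CSE"},
    {"name": "Wasim", "age": 19, "branch": "CSE"},
    {"name": "Abdul", "age": 18, "branch": "IT"},
]
print(project(student, ["name", "age"]))
```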

Union Operation (∪)


This operation is used to fetch data from two relations (tables) or temporary relations (results of other
operations). For this operation to work, the relations (tables) specified should have the same number of
attributes (columns) and the same attribute domains. Duplicate tuples are automatically eliminated from the
result.
Syntax: A ∪ B
where A and B are relations.
For example, if we have two tables RegularClass and ExtraClass, and both have a column Student to save the name
of the student, then,
∏Student(RegularClass) ∪ ∏Student(ExtraClass)
The above operation will give us the names of students who attend regular classes, extra classes, or both,
eliminating repetition.


Set Difference (-)


This operation is used to find data present in one relation and not present in the second relation. This operation is
also applicable on two relations, just like Union operation.
Syntax: A - B
where A and B are relations.
For example, if we want to find name of students who attend the regular class but not the extra class, then, we can
use the below operation:
∏Student(RegularClass) - ∏Student(ExtraClass)
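
Because relations are sets of tuples, Union and Set Difference map directly onto Python's set operators. In
this sketch the two sets stand for the projected Student column of each table, and the names in them are
hypothetical sample data.

```python
# Result of ∏Student(RegularClass) and ∏Student(ExtraClass), as sets of 1-tuples.
regular_class = {("Abdullah",), ("Abdul",), ("Wasim",)}
extra_class = {("Wasim",), ("Malik",)}

# Union: students in either class; the shared tuple ("Wasim",) appears once.
either = regular_class | extra_class

# Set difference: students in the regular class but not in the extra class.
regular_only = regular_class - extra_class

print(sorted(either), sorted(regular_only))
```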

Cartesian product (X)


This is used to combine data from two different relations (tables) into one and fetch data from the combined
relation.
Syntax: A X B
For example, if we want to find the information for RegularClass and ExtraClass which are conducted during the
morning, then we can use the following operation:

σtime = 'morning' (RegularClass X ExtraClass)


For the above query to work, both RegularClass and ExtraClass should have the attribute time.
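
A hedged Python sketch of the product followed by a selection is given below. The cartesian_product helper and
the sample rows are hypothetical. Since both relations have a time attribute, the sketch prefixes every
attribute with its relation name to keep the two apart, and it reads the query as "both classes are in the
morning", which is an assumption about the intended meaning.

```python
from itertools import product


def cartesian_product(r, s, r_name, s_name):
    """r X s: pair every tuple of r with every tuple of s, prefixing attribute names."""
    return [
        {**{f"{r_name}.{k}": v for k, v in a.items()},
         **{f"{s_name}.{k}": v for k, v in b.items()}}
        for a, b in product(r, s)
    ]


# Hypothetical sample relations.
regular = [{"subject": "DBMS", "time": "morning"}, {"subject": "OS", "time": "evening"}]
extra = [{"subject": "SQL Lab", "time": "morning"}]

combined = cartesian_product(regular, extra, "RegularClass", "ExtraClass")
morning = [t for t in combined
           if t["RegularClass.time"] == "morning" and t["ExtraClass.time"] == "morning"]
print(len(combined), morning)
```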

Rename Operation (ρ)


This operation is used to rename the output relation of any query operation that returns a result, like Select,
Project etc., or simply to rename a relation (table).
Syntax: ρ(RelationNew, RelationOld)
Apart from these common operations, Relational Algebra is also used for Join operations like:

• Natural Join
• Outer Join
• Theta Join etc.

Relational Calculus
In contrast to Relational Algebra, Relational Calculus is a non-procedural query language, that is, it tells what
to do but never explains how to do it.

Relational calculus exists in two forms −

Tuple Relational Calculus (TRC)


Filtering variable ranges over tuples

Notation − {T | Condition}

Returns all tuples T that satisfy a condition.


For example −
{ T.name | Author(T) AND T.article = 'database' }

Output − Returns the 'name' of each tuple in Author for an author who has written an article on 'database'.

TRC can be quantified. We can use Existential (∃) and Universal Quantifiers (∀).

For example −
{ R| ∃T ∈ Authors(T.article='database' AND R.name=T.name)}

Output − The above query will yield the same result as the previous one.
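
Python set comprehensions read much like tuple relational calculus, because they state the condition a result
must meet and leave the iteration to the language. The sketch below writes both queries this way; the authors
list and the names in it are hypothetical sample data.

```python
# Hypothetical Authors relation.
authors = [
    {"name": "A. Khan", "article": "database"},
    {"name": "B. Ali", "article": "networks"},
    {"name": "C. Malik", "article": "database"},
]

# { T.name | Author(T) AND T.article = 'database' }
names = {t["name"] for t in authors if t["article"] == "database"}

# { R | ∃T ∈ Authors (T.article = 'database' AND R.name = T.name) }
candidates = {t["name"] for t in authors}
names_exists = {
    r for r in candidates
    if any(t["article"] == "database" and r == t["name"] for t in authors)  # ∃T
}

print(sorted(names), names == names_exists)
```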

Domain Relational Calculus (DRC)


In DRC, the filtering variable uses the domain of attributes instead of entire tuple values (as done in TRC,
mentioned above).

Notation −

{ a1, a2, a3, ..., an | P (a1, a2, a3, ... ,an)}

where a1, a2, ..., an are attributes and P stands for a formula built from these attributes.

For example −
{< article, page, subject > | < article, page, subject > ∈ TutorialsPoint ∧ subject = 'database'}

Output − Yields Article, Page, and Subject from the relation TutorialsPoint, where subject is database.

Just like TRC, DRC can also be written using existential and universal quantifiers. DRC also involves relational
operators.

The expressive power of Tuple Relational Calculus and Domain Relational Calculus is equivalent to that of
Relational Algebra.

ER Model to Relational Model


An ER Model can be represented using ER diagrams, which are a convenient way of designing and representing
the database design in something like a flow-chart form.
It is very convenient to design the database using the ER Model by creating an ER diagram and later converting
it into the relational model to design your tables.
Not all ER Model constraints and components can be directly transformed into the relational model, but an
approximate schema can be derived.
So let's take a few examples of ER diagrams and convert them into relational model schemas, hence creating
tables in an RDBMS.

Entity becomes Table


Each entity in the ER Model is changed into a table; that is, for every entity in the ER model, a table is
created in the relational model. The attributes of the entity are converted to columns of the table.

The primary key specified for the entity in the ER model becomes the primary key for the table in the
relational model.
For example, for an ER diagram with a Student entity that has the attributes id, name, age and address, a table
named Student will be created in the relational model. It will have 4 columns (id, name, age, address), and id
will be the primary key for this table.

Relationship becomes a Relationship Table


In an ER diagram, we use a diamond/rhombus to represent a relationship between two entities. In the relational
model, we create a relationship table for ER Model relationships too.
Consider an ER diagram with two entities, Teacher and Student, and a relationship between them.

As discussed above, each entity gets mapped to a table, hence we will create a table for Teacher and a table for
Student with all the attributes converted into columns.
Now, an additional table will be created for the relationship, for example StudentTeacher (or any name you
like). This table will hold the primary keys of both Student and Teacher in each tuple, to describe which
teacher teaches which student.
If there are additional attributes related to this relationship, they become columns of this table, like
subject name.
Proper foreign key constraints must also be set for all the tables.
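
The mapping can be sketched with Python's built-in sqlite3 module. The Student columns follow the example in
the text; the Teacher columns, the column names of StudentTeacher, its composite primary key and the sample rows
are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    -- One table per entity; attributes become columns.
    CREATE TABLE Student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, address TEXT);
    CREATE TABLE Teacher (id INTEGER PRIMARY KEY, name TEXT);

    -- Relationship table: holds both primary keys as foreign keys,
    -- plus the relationship's own attribute.
    CREATE TABLE StudentTeacher (
        student_id   INTEGER REFERENCES Student (id),
        teacher_id   INTEGER REFERENCES Teacher (id),
        subject_name TEXT,
        PRIMARY KEY (student_id, teacher_id)
    );

    INSERT INTO Student VALUES (1, 'Abdullah', 19, 'Jauharabad');
    INSERT INTO Teacher VALUES (10, 'Mr. X');
    INSERT INTO StudentTeacher VALUES (1, 10, 'Database Systems');
""")

# Which teacher teaches which student, and in which subject?
taught = conn.execute("""
    SELECT Student.name, Teacher.name, subject_name
    FROM StudentTeacher
    JOIN Student ON Student.id = student_id
    JOIN Teacher ON Teacher.id = teacher_id
""").fetchall()
print(taught)
```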

Points to Remember
Similarly we can generate relational database schema using the ER diagram. Following are some key points to keep
in mind while doing so:

1. Entity gets converted into Table, with all the attributes becoming fields (columns) in the table.
2. Relationship between entities is also converted into table with primary keys of the related entities also stored
in it as foreign keys.
3. Primary Keys should be properly set.
4. For any relationship of Weak Entity, if primary key of any other entity is included in a table, foreign key
constraint must be defined.

Normalization of Database:
Database Normalization is a technique of organizing the data in the database. Normalization is a systematic
approach of decomposing tables to eliminate data redundancy (repetition) and undesirable characteristics like
Insertion, Update and Deletion Anomalies. It is a multi-step process that puts data into tabular form, removing
duplicated data from the relation tables.

Normalization is used mainly for two purposes:

1. Eliminating redundant (repeated) data.


2. Ensuring data dependencies make sense, i.e. data is logically stored.

Problems without Normalization:

If a table is not properly normalized and has data redundancy, it will not only take up extra storage space but
will also make it difficult to handle and update the database without facing data loss. Insertion, Updation and
Deletion Anomalies are very frequent if a database is not normalized. To understand these anomalies, let us take
the example of a Student table.

rollno  name      branch  hod    office_tel
401     Abdullah  CSE     Mr. X  53337
402     Abdul     CSE     Mr. X  53337
403     Wasim     CSE     Mr. X  53337
404     Malik     CSE     Mr. X  53337

In the table above, we have data for 4 Computer Science students. As we can see, the data for the fields branch,
hod (Head of Department) and office_tel is repeated for the students who are in the same branch of the college.
This is Data Redundancy.

Insertion Anomaly
Suppose a student is newly admitted: until the student opts for a branch, the student's data cannot be inserted,
or else we will have to set the branch information as NULL.
Also, if we have to insert data for 100 students of the same branch, then the branch information will be
repeated for all those 100 students.
These scenarios are Insertion anomalies.

Updation Anomaly
What if Mr. X leaves the college, or is no longer the HOD of the computer science department? In that case all
the student records will have to be updated, and if by mistake we miss any record, it will lead to data
inconsistency. This is an Updation anomaly.

Deletion Anomaly
In our Student table, two different kinds of information are kept together: student information and branch
information. Hence, at the end of the academic year, if the student records are deleted, we will also lose the
branch information. This is a Deletion anomaly.
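
One way to see why splitting the table helps is to keep the branch information in a table of its own. The
sketch below does this with Python's built-in sqlite3 module, using the rows from the Student table above. The
two-table layout and the new HOD name 'Mr. Y' are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Branch information is stored once, not once per student.
    CREATE TABLE branch (branch TEXT PRIMARY KEY, hod TEXT, office_tel TEXT);
    CREATE TABLE student (rollno INTEGER PRIMARY KEY, name TEXT,
                          branch TEXT REFERENCES branch (branch));
    INSERT INTO branch VALUES ('CSE', 'Mr. X', '53337');
    INSERT INTO student VALUES (401, 'Abdullah', 'CSE'), (402, 'Abdul', 'CSE'),
                               (403, 'Wasim', 'CSE'), (404, 'Malik', 'CSE');
""")

# Update: a new HOD is recorded by changing exactly one row.
changed = conn.execute(
    "UPDATE branch SET hod = 'Mr. Y' WHERE branch = 'CSE'"
).rowcount

# Deletion: removing all student records leaves the branch information intact.
conn.execute("DELETE FROM student")
branches = conn.execute("SELECT * FROM branch").fetchall()
print(changed, branches)
```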

First Normal Form (1NF)


First Normal Form is defined in the definition of relations (tables) itself. This rule defines that all the
attributes in a relation must have atomic domains. The values in an atomic domain are indivisible units.

To convert a relation (table) to First Normal Form, we re-arrange it so that each attribute contains only a
single value from its pre-defined domain.

Second Normal Form (2NF)


Before we learn about the second normal form, we need to understand the following −

• Prime attribute − An attribute that is part of a candidate key is known as a prime attribute.

• Non-prime attribute − An attribute that is not part of any candidate key is said to be a non-prime attribute.

If we follow second normal form, then every non-prime attribute should be fully functionally dependent on
the prime key attributes. That is, if X → A holds, then there should not be any proper subset Y of X for which
Y → A also holds true.

Consider a Student_Project relation with the attributes Stu_ID, Proj_ID, Stu_Name and Proj_Name, in which the
prime key attributes are Stu_ID and Proj_ID. According to the rule, the non-key attributes, i.e. Stu_Name and
Proj_Name, must be dependent upon both and not on any of the prime key attributes individually. But we find that
Stu_Name can be identified by Stu_ID and Proj_Name can be identified by Proj_ID independently. This is called
partial dependency, which is not allowed in Second Normal Form.


We break the relation in two, so that Stu_Name is stored with Stu_ID and Proj_Name is stored with Proj_ID.
Now there exists no partial dependency.
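
The check for partial dependencies can be sketched in a few lines of Python. The partial_dependencies function
is hypothetical, it assumes the relation has a single candidate key, and functional dependencies are written as
(left side, right-side attribute) pairs.

```python
def partial_dependencies(candidate_key, fds):
    """Return the FDs whose left side is a proper subset of the key and whose
    right side is a non-prime attribute. These are the 2NF violations."""
    key = set(candidate_key)
    return [(lhs, rhs) for lhs, rhs in fds
            if set(lhs) < key and rhs not in key]


# Student_Project with key (Stu_ID, Proj_ID).
fds = [(("Stu_ID",), "Stu_Name"), (("Proj_ID",), "Proj_Name")]
violations = partial_dependencies(("Stu_ID", "Proj_ID"), fds)

# After the split, Stu_ID alone is the key of the relation holding Stu_Name.
after_split = partial_dependencies(("Stu_ID",), [(("Stu_ID",), "Stu_Name")])
print(violations, after_split)
```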

Third Normal Form (3NF)


For a relation to be in Third Normal Form, it must be in Second Normal Form and the following must hold −

• No non-prime attribute is transitively dependent on a prime key attribute.

• For any non-trivial functional dependency X → A, either −

o X is a superkey or,
o A is a prime attribute.

Consider a Student_Detail relation with the attributes Stu_ID, Stu_Name, Zip and City, in which Stu_ID is the
key and the only prime key attribute. We find that City can be identified by Stu_ID as well as by Zip itself.
Neither is Zip a superkey nor is City a prime attribute. Additionally, Stu_ID → Zip → City, so there exists a
transitive dependency.

To bring this relation into third normal form, we break it into two relations: Student_Detail (Stu_ID,
Stu_Name, Zip) and ZipCodes (Zip, City).

Boyce-Codd Normal Form(BCNF)


Boyce-Codd Normal Form (BCNF) is a stricter version of Third Normal Form. BCNF states that −

 For any non-trivial functional dependency, X → A, X must be a super-key.

In the above image, Stu_ID is the super-key in the relation Student_Detail and Zip is the super-key in the
relation ZipCodes. So,

Stu_ID → Stu_Name, Zip

and

Zip → City

each have a super-key on the left-hand side, which confirms that both relations are in BCNF.
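The BCNF condition can be checked mechanically with attribute closures. The sketch below is illustrative only: the function names are made up, and the attribute list of the original Student_detail relation (Stu_ID, Stu_Name, City, Zip) is inferred from the dependencies mentioned in the text.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return the non-trivial dependencies X -> A inside the relation
    whose left-hand side X is not a superkey of the relation."""
    bad = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        if not (lhs | rhs) <= relation:
            continue  # dependency does not apply to this relation
        if rhs <= lhs:
            continue  # trivial dependency
        if closure(lhs, fds) >= relation:
            continue  # X is a superkey, so BCNF is satisfied
        bad.append((sorted(lhs), sorted(rhs)))
    return bad


fds = [({"Stu_ID"}, {"Stu_Name", "Zip"}), ({"Zip"}, {"City"})]

before = bcnf_violations({"Stu_ID", "Stu_Name", "City", "Zip"}, fds)
after_student = bcnf_violations({"Stu_ID", "Stu_Name", "Zip"}, fds)
after_zip = bcnf_violations({"Zip", "City"}, fds)
print(before, after_student, after_zip)
```

Before the decomposition the check reports Zip → City, because Zip alone does not determine the whole relation. After the decomposition neither relation reports a violation.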


Designing Databases/Data Base Life Cycle:

A database is usually a fundamental component of the information system, especially in business oriented
systems. Thus database design is part of system development. The following picture shows how database
design is involved in the system development lifecycle.

The phases in the middle of the picture (Database Design, Database Implementation) are the phases that you
concentrate on in the Database Design course. The other phases are briefly described. They are part of the
contents of the Systems Analysis and Design courses, for example.

There are various methods for carrying out the different phases of information system analysis, design and
implementation. Here the main tasks or goals are described, but no particular method is introduced.

Database Planning

The database planning includes the activities that allow the stages of the database system development
lifecycle to be realized as efficiently and effectively as possible. This phase must be integrated with the overall
Information System strategy of the organization.

The very first step in database planning is to define the mission statement and objectives for the database
system. That is the definition of:
- the major aims of the database system
- the purpose of the database system
- the supported tasks of the database system
- the resources of the database system

Systems Definition

In the systems definition phase, the scope and boundaries of the database application are described. This
description includes:
- links with the other information systems of the organization
- what the planned system is going to do now and in the future
- who the users are now and in the future.

The major user views are also described, i.e. what is required of the database system from the perspective of
particular job roles or enterprise application areas.

Requirements Collection and Analysis


During the requirements collection and analysis phase, information about the part of the enterprise to be
served by the database is collected and analyzed. The results may include, e.g.:
- the description of the data used or generated
- the details of how the data is to be used or generated
- any additional requirements for the new database system

Database Design

The database design phase is divided into three steps:


- conceptual database design
- logical database design
- physical database design

In the conceptual database design phase, a model of the data to be used is constructed, independent of all
physical considerations. The model is based on the requirements specification of the system.

In the logical database design phase, a model of the data is constructed that is based on a specific data
model but independent of any particular database management system. The specific data model is the target
data model for the database, e.g. the relational data model.

In the physical database design phase, the description of the implementation of the database on secondary
storage is created. The base relations, indexes, integrity constraints, security, etc. are defined using the SQL
language.

Database Management System Selection

This is an optional phase. It is carried out when there is a need for a new database management system (DBMS).
Examples of DBMS products are Access, SQL Server, MySQL and Oracle.

In this phase the criteria for the new DBMS are defined. Then several products are evaluated according to the
criteria. Finally, a recommendation for the selection is made.

Application Design

In the application design phase, the design of the user interface and the application programs that use and
process the database are defined and designed.

Prototyping

The purpose of a prototype is to let the users try out a working model on the computer and so identify the
features the final system should have.

There are horizontal and vertical prototypes. A horizontal prototype has many features (e.g. user interfaces)
but they are not working. A vertical prototype has very few features but they are working. See the following
picture.


Implementation

During the implementation phase, the physical realization of the database and application designs is carried
out. This is the programming phase of the systems development.

Data Conversion and Loading

This phase is needed when a new database is replacing an old system. During this phase the existing data will
be transferred into the new database.

Testing

Before the new system goes live, it should be thoroughly tested. The goal of testing is to find errors. The
goal is not to prove that the software is working well.

Operational Maintenance

The operational maintenance is the process of monitoring and maintaining the database system.

Monitoring means that the performance of the system is observed. If the performance of the system falls
below an acceptable level, tuning or reorganization of the database may be required.

Maintaining and upgrading the database system means that, when new requirements arise, the development
lifecycle is run through again for them.

Physical Database Design


Physical database design translates the logical data model into a set of SQL statements that define the database. For
relational database systems, it is relatively easy to translate from a logical data model into a physical database.

Rules for translation:

 Entities become tables in the physical database.

 Attributes become columns in the physical database. Choose an appropriate data type for each of the columns.

 Unique identifiers become columns that are not allowed to have NULL values. These are referred to as primary
keys in the physical database. Consider creating a unique index on the identifiers to enforce uniqueness.

 Relationships are modeled as foreign keys.

Spaces are not allowed in entity names in a physical schema because these names must translate into SQL calls to create
the tables. Table names should therefore conform to SQL naming rules.

Because primary key attributes are complete inventions, they can be of any indexable data type. (Each database engine
has different rules about which data types are indexable.) Making primary keys of type INT is almost purely arbitrary.

It is only almost arbitrary because searching on numeric fields is actually faster in many database engines. However, one
could just as well have chosen CHAR as the type for the primary key fields. The bottom line is that this choice should
be driven by the criteria for choosing identifiers.

Physical Table Definitions


Table        Column           Data Type  Notes
CD           CDId             INT        Primary Key
             CDTitle          TEXT(50)
Artist       ArtistId         INT        Primary Key
             ArtistName       TEXT(50)
Song         SongId           INT        Primary Key
             SongName         TEXT(50)
RecordLabel  RecordLabelId    INT        Primary Key
             RecordLabelName  TEXT(50)

Model relationships by adding a foreign key to one of the tables involved in the relationship. A foreign key is the unique
identifier or primary key of the table on the other side of the relationship.

The most common relationship is the 1-to-M relationship. This relationship is mapped by placing the primary key on the
"one" side of the relationship into the table on the "many" side.

1-to-1 relationships should be mapped by picking one of the tables and giving it a foreign key column that matches the
primary key from the other table. In theory, it does not matter which table is chosen, but practical considerations may
dictate which column makes the most sense as a foreign key.

Physical Data Model


Table        Column           Data Type  Notes
CD           CDId             INT        Primary Key
             CDTitle          TEXT(50)
             RecordLabelId    INT        Foreign Key
Artist       ArtistId         INT        Primary Key
             ArtistName       TEXT(50)
Song         SongId           INT        Primary Key
             SongName         TEXT(50)
             CDId             INT        Foreign Key
             ArtistId         INT        Foreign Key
RecordLabel  RecordLabelId    INT        Primary Key
             RecordLabelName  TEXT(50)

The last remaining task is to translate the complete physical database schema into SQL. For each table in the schema,
write one CREATE TABLE statement. Typically, designers create unique indices on the primary keys to enforce uniqueness.

Example script to create the database in MySQL:

CREATE TABLE CD (CDId INT NOT NULL, RecordLabelId INT, CDTitle TEXT, PRIMARY KEY (CDId));
CREATE TABLE Artist (ArtistId INT NOT NULL, ArtistName TEXT, PRIMARY KEY (ArtistId));
CREATE TABLE Song (SongId INT NOT NULL, CDId INT, ArtistId INT, SongName TEXT, PRIMARY KEY (SongId));
CREATE TABLE RecordLabel (RecordLabelId INT NOT NULL, RecordLabelName TEXT, PRIMARY KEY (RecordLabelId));

Data models are meant to be database independent. These techniques and data models may therefore be applied not
only to MySQL, but also to Oracle, Sybase, Ingres or any other relational database engine.
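As an illustration of this database independence, the same physical model can be created in SQLite through Python's built-in sqlite3 module. This is a sketch under assumptions: the REFERENCES clauses, the table creation order and the sample rows are additions made for the example, and SQLite only enforces foreign keys when the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

# Referenced tables are created before the tables that point at them.
conn.executescript("""
CREATE TABLE RecordLabel (RecordLabelId INTEGER PRIMARY KEY, RecordLabelName TEXT);
CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, ArtistName TEXT);
CREATE TABLE CD (
    CDId INTEGER PRIMARY KEY,
    CDTitle TEXT,
    RecordLabelId INTEGER REFERENCES RecordLabel(RecordLabelId)
);
CREATE TABLE Song (
    SongId INTEGER PRIMARY KEY,
    SongName TEXT,
    CDId INTEGER REFERENCES CD(CDId),
    ArtistId INTEGER REFERENCES Artist(ArtistId)
);
""")

# Made-up sample rows: one label and one CD on the "many" side.
conn.execute("INSERT INTO RecordLabel VALUES (1, 'Example Records')")
conn.execute("INSERT INTO CD VALUES (1, 'Example Album', 1)")

# A CD that points at a label that does not exist should be refused.
try:
    conn.execute("INSERT INTO CD VALUES (2, 'Orphan Album', 99)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

cd_count = conn.execute("SELECT COUNT(*) FROM CD").fetchone()[0]
print(fk_enforced, cd_count)
```

The foreign key on the "many" side (CD.RecordLabelId) is what implements the 1-to-M relationship between RecordLabel and CD.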

Query optimization:
A query optimizer is a critical database management system (DBMS) component that analyzes Structured
Query Language (SQL) queries and determines efficient execution mechanisms. A query optimizer generates
one or more query plans for each query, each of which is a possible way to run the query. The most
efficient query plan is selected and used to run the query.
SQL statements are used to retrieve data from the database. The same results can often be obtained by writing
different SQL queries, but choosing the best query matters when performance is considered. SQL queries
therefore need to be tuned to the requirement. The following example shows a query that is used regularly and
how it can be optimized for better performance.


Example:
An SQL query can become faster if you use the actual column names in the SELECT statement instead
of '*', because the DBMS then reads and returns only the columns that are needed.
For example, write the query as
SELECT id, first_name, last_name, age, subject FROM student_details;

Instead of:
SELECT * FROM student_details;
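One way to see the effect is to ask the DBMS for its query plan. The sketch below uses SQLite through Python's sqlite3 module; the index idx_age, the WHERE clause and the sample rows are assumptions added for the demonstration, and other database engines report their plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student_details ("
    "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, "
    "age INTEGER, subject TEXT)"
)
conn.execute("CREATE INDEX idx_age ON student_details (age)")
conn.executemany(
    "INSERT INTO student_details VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ali", "Khan", 18, "Math"),        # made-up sample rows
        (2, "Sara", "Malik", 19, "Physics"),
        (3, "Omar", "Raza", 18, "Chemistry"),
    ],
)


def plan(sql):
    """Return the textual query plan SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


narrow_plan = plan("SELECT id, age FROM student_details WHERE age = 18")
star_plan = plan("SELECT * FROM student_details WHERE age = 18")
narrow_rows = conn.execute(
    "SELECT id, age FROM student_details WHERE age = 18"
).fetchall()

print(narrow_plan)
print(star_plan)
```

With only id and age requested, SQLite can answer the query from the index alone (a covering index). With '*' it must also visit the table rows to fetch the remaining columns.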

Client-Server & Distributed Environment:


The client-server model describes how a server provides resources and services to one or more clients. Examples of
servers include web servers, mail servers, and file servers. Each of these servers provides resources to client devices, such
as desktop computers, laptops, tablets, and smartphones. Most servers have a one-to-many relationship with clients,
meaning a single server can provide resources to multiple clients at one time.
When a client requests a connection to a server, the server can either accept or reject the connection. If the connection
is accepted, the server establishes and maintains a connection with the client over a specific protocol. For example,
an email client may request an SMTP connection to a mail server in order to send a message. The SMTP application on
the mail server will then request authentication from the client, such as the email address and password. If these
credentials match an account on the mail server, the server will send the email to the intended recipient.
Online multiplayer gaming also uses the client-server model. One example is Blizzard's Battle.net service, which hosts
online games for World of Warcraft, StarCraft, Overwatch, and others. When players open a Blizzard application, the
game client automatically connects to a Battle.net server. Once players log in to Battle.net, they can see who else
is online, chat with other players, and play matches with or against other gamers.
While Internet servers typically provide connections to multiple clients at a time, each physical machine can only handle
so much traffic. Therefore, popular online services distribute clients across multiple physical servers, using a technique
called distributed computing. In most cases, it does not matter which specific machine users are connected to, since the
servers all provide the same service.

ODBC

ODBC stands for "Open Database Connectivity." With all the different types of databases available, such as Microsoft Access,
FileMaker, and MySQL, it is important to have a standard way of transferring data to and from each kind of database. For
this reason, the SQL Access Group created the ODBC standard back in 1992. Any application that supports ODBC can
access information from an ODBC-compatible database, regardless of what database management system the database
uses.
For a database to be ODBC-compatible, it must include an ODBC database driver. This allows other applications to
connect to and access information from the database with a standard set of commands. The driver translates standard
ODBC commands into commands understood by the database's proprietary system. Thanks to ODBC, a single application
(such as a web server program) can access information from several different databases using the same set of commands.

DBMS - Concurrency Control


In a multiprogramming environment where multiple transactions can be executed simultaneously, it is highly important
to control the concurrency of transactions. We have concurrency control protocols to ensure atomicity, isolation, and
serializability of concurrent transactions. Concurrency control protocols can be broadly divided into two categories −


 Lock based protocols


 Timestamp-based protocols

Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which no transaction can read or write
data until it acquires an appropriate lock on it. Locks are of two kinds −

 Binary Locks − A lock on a data item can be in two states; it is either locked or unlocked.

 Shared/exclusive − This type of locking mechanism differentiates the locks based on their uses. If a lock is
acquired on a data item to perform a write operation, it is an exclusive lock. Allowing more than one
transaction to write on the same data item would lead the database into an inconsistent state. Read locks are
shared because no data value is being changed.
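The shared/exclusive rule can be sketched as a small lock table. This is an illustrative model only: the class LockTable, its method names and the mode letters 'S' and 'X' are made up for the example, and a real DBMS would queue waiting requests instead of simply refusing them.

```python
class LockTable:
    """Toy lock table: many readers ('S') or one writer ('X') per item."""

    def __init__(self):
        self.locks = {}  # item -> (mode, set of holding transactions)

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if it conflicts."""
        if item not in self.locks:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = self.locks[item]
        if holders == {txn}:
            # The sole holder may re-request or upgrade its own lock.
            new_mode = "X" if "X" in (held_mode, mode) else "S"
            self.locks[item] = (new_mode, holders)
            return True
        if held_mode == "S" and mode == "S":
            holders.add(txn)  # read locks are shared
            return True
        return False  # any combination involving 'X' conflicts

    def release(self, txn, item):
        """Drop txn's lock on item; forget the item when nobody holds it."""
        _, holders = self.locks.get(item, (None, set()))
        holders.discard(txn)
        if not holders:
            self.locks.pop(item, None)


table = LockTable()
r1 = table.request("T1", "A", "S")   # first reader
r2 = table.request("T2", "A", "S")   # second reader shares the lock
w3 = table.request("T3", "A", "X")   # writer conflicts with the readers
up = table.request("T1", "A", "X")   # upgrade blocked while T2 also reads
table.release("T2", "A")
up_after = table.request("T1", "A", "X")  # now T1 is the only holder
print(r1, r2, w3, up, up_after)
```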

There are four types of lock protocols available −

Simplistic Lock Protocol:


Simplistic lock-based protocols require a transaction to obtain a lock on every object before a 'write' operation is
performed on it. Transactions may unlock the data item after completing the 'write' operation.

Pre-claiming Lock Protocol


Pre-claiming protocols evaluate their operations and create a list of data items on which they need locks. Before
initiating an execution, the transaction requests the system for all the locks it needs beforehand. If all the locks are
granted, the transaction executes and releases all the locks when all its operations are over. If not all the locks are
granted, the transaction rolls back and waits until all the locks are granted.

Two-Phase Locking (2PL)


This locking protocol divides the execution phase of a transaction into three parts. In the first part, when the
transaction starts executing, it seeks permission for the locks it requires. The second part is where the transaction
acquires all the locks. As soon as the transaction releases its first lock, the third phase starts. In this phase, the
transaction cannot demand any new locks; it only releases the acquired locks.

Two-phase locking has two phases, one is growing, where all the locks are being acquired by the transaction; and the
second phase is shrinking, where the locks held by the transaction are being released.


To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then upgrade it to an
exclusive lock.
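The growing and shrinking phases can be enforced with a single flag per transaction. The sketch below is a simplified model, not a full lock manager: the class name TwoPhaseTransaction and its methods are hypothetical, and conflicts between transactions are left out.

```python
class TwoPhaseTransaction:
    """Tracks the locks of one transaction and enforces the 2PL rule."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False  # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot lock {item} after releasing a lock"
            )
        self.held.add(item)  # growing phase

    def unlock(self, item):
        self.held.remove(item)
        self.shrinking = True  # shrinking phase starts here


t = TwoPhaseTransaction("T1")
t.lock("A")
t.lock("B")      # still growing
t.unlock("A")    # first release: the shrinking phase begins

try:
    t.lock("C")  # not allowed any more
    violation_caught = False
except RuntimeError:
    violation_caught = True

print(violation_caught, t.held)
```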

Strict Two-Phase Locking


The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in the first phase, the transaction continues to
execute normally. But in contrast to 2PL, Strict-2PL does not release a lock after using it. Strict-2PL holds all the locks
until the commit point and releases them all at once.

Strict-2PL does not have cascading abort as 2PL does.

Timestamp-based Protocols
One of the most commonly used concurrency protocols is the timestamp-based protocol. This protocol uses either the system time
or a logical counter as a timestamp.

Lock-based protocols manage the order between the conflicting pairs among transactions at the time of execution,
whereas timestamp-based protocols start working as soon as a transaction is created.

Every transaction has a timestamp associated with it, and the ordering is determined by the age of the transaction. A
transaction created at 0002 clock time would be older than all other transactions that come after it. For example, any
transaction 'y' entering the system at 0004 is two seconds younger and the priority would be given to the older one.

In addition, every data item is given the latest read and write-timestamp. This lets the system know when the last ‘read
and write’ operation was performed on the data item.

Timestamp Ordering Protocol


The timestamp-ordering protocol ensures serializability among transactions in their conflicting read and write
operations. It is the responsibility of the protocol to make sure that each conflicting pair of operations is executed
according to the timestamp values of the transactions.

 The timestamp of transaction Ti is denoted as TS(Ti).


 Read time-stamp of data-item X is denoted by R-timestamp(X).
 Write time-stamp of data-item X is denoted by W-timestamp(X).

Timestamp ordering protocol works as follows −

 If a transaction Ti issues a read(X) operation −

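o If TS(Ti) < W-timestamp(X), the operation is rejected.
o If TS(Ti) >= W-timestamp(X), the operation is executed.
o After an executed read, the data-item timestamps are updated: R-timestamp(X) becomes the larger of R-timestamp(X) and TS(Ti).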


 If a transaction Ti issues a write(X) operation −

o If TS(Ti) < R-timestamp(X), the operation is rejected.
o If TS(Ti) < W-timestamp(X), the operation is rejected and Ti is rolled back.
o Otherwise, the operation is executed.

Thomas' Write Rule
The basic timestamp-ordering rule states that if TS(Ti) < W-timestamp(X), then the write operation is rejected and
Ti is rolled back.

Thomas' write rule modifies the timestamp-ordering rules to make the schedule view serializable: instead of rolling
Ti back, the obsolete 'write' operation itself is ignored.
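The read and write rules, together with Thomas' write rule, can be sketched in a few lines. This is a simplified model: the class DataItem, the function names and the returned strings are made up for the example, timestamps are plain integers, and a rejected operation is treated as a rollback of the issuing transaction.

```python
class DataItem:
    """Holds the read and write timestamps of one data item X."""

    def __init__(self):
        self.r_ts = 0  # R-timestamp(X)
        self.w_ts = 0  # W-timestamp(X)


def read(ts, item):
    """Apply the read rule for a transaction with timestamp ts."""
    if ts < item.w_ts:
        return "rollback"  # a younger transaction already wrote X
    item.r_ts = max(item.r_ts, ts)
    return "ok"


def write(ts, item, thomas=False):
    """Apply the write rule; thomas=True enables Thomas' write rule."""
    if ts < item.r_ts:
        return "rollback"  # a younger transaction already read X
    if ts < item.w_ts:
        # Obsolete write: ignored under Thomas' rule, rejected otherwise.
        return "ignore" if thomas else "rollback"
    item.w_ts = ts
    return "ok"


x = DataItem()
steps = [
    write(5, x),   # W-timestamp(X) becomes 5
    read(3, x),    # older reader arrives too late
    read(7, x),    # R-timestamp(X) becomes 7
    write(6, x),   # write is older than the last read
]

y = DataItem()
write(10, y)
basic = write(8, y)                    # basic rule
with_thomas = write(8, y, thomas=True)  # Thomas' write rule
print(steps, basic, with_thomas)
```

Under Thomas' write rule the late write is dropped without touching W-timestamp(X), so the issuing transaction keeps running.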

DBMS - Deadlock
In a multi-process system, deadlock is an unwanted situation that arises in a shared resource environment, where a
process indefinitely waits for a resource that is held by another process.

For example, assume a set of transactions {T0, T1, T2, ...,Tn}. T0 needs a resource X to complete its task. Resource X is
held by T1, and T1 is waiting for a resource Y, which is held by T2. T2 is waiting for resource Z, which is held by T0. Thus,
all the processes wait for each other to release resources. In this situation, none of the processes can finish their task.
This situation is known as a deadlock.

Deadlocks are not healthy for a system. In case a system is stuck in a deadlock, the transactions involved in the
deadlock are either rolled back or restarted.

Deadlock Prevention
To prevent any deadlock situation in the system, the DBMS aggressively inspects all the operations, where transactions
are about to execute. The DBMS inspects the operations and analyzes if they can create a deadlock situation. If it finds
that a deadlock situation might occur, then that transaction is never allowed to be executed.

There are deadlock prevention schemes that use timestamp ordering mechanism of transactions in order to
predetermine a deadlock situation.

Wait-Die Scheme
In this scheme, if a transaction requests to lock a resource (data item), which is already held with a conflicting lock by
another transaction, then one of the two possibilities may occur −

 If TS(Ti) < TS(Tj) − that is Ti, which is requesting a conflicting lock, is older than Tj − then Ti is allowed to wait until
the data-item is available.

 If TS(Ti) > TS(Tj) − that is Ti is younger than Tj − then Ti dies. Ti is restarted later with a random delay but with the
same timestamp.

This scheme allows the older transaction to wait but kills the younger one.

Wound-Wait Scheme
In this scheme, if a transaction requests to lock a resource (data item), which is already held with a conflicting lock by
another transaction, one of the two possibilities may occur −


 If TS(Ti) < TS(Tj), then Ti forces Tj to be rolled back − that is Ti wounds Tj. Tj is restarted later with a random delay
but with the same timestamp.

 If TS(Ti) > TS(Tj), then Ti is forced to wait until the resource is available.

This scheme allows the younger transaction to wait; but when an older transaction requests an item held by a younger
one, the older transaction forces the younger one to abort and release the item.

In both the cases, the transaction that enters the system at a later stage is aborted.
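Both schemes reduce to a comparison of two timestamps. The following sketch shows the decision only; the function names and the returned strings are made up for the illustration, and a smaller timestamp means an older transaction.

```python
def wait_die(ts_requester, ts_holder):
    """Wait-die: an older requester waits, a younger requester dies."""
    return "wait" if ts_requester < ts_holder else "die"


def wound_wait(ts_requester, ts_holder):
    """Wound-wait: an older requester wounds the holder, a younger one waits."""
    return "wound holder" if ts_requester < ts_holder else "wait"


# Requester older than holder (timestamp 2 versus 4), then the reverse.
decisions = {
    "wait_die_older": wait_die(2, 4),
    "wait_die_younger": wait_die(4, 2),
    "wound_wait_older": wound_wait(2, 4),
    "wound_wait_younger": wound_wait(4, 2),
}
print(decisions)
```

In each scheme only the younger transaction is ever aborted: it dies as the requester under wait-die, and it is wounded as the holder under wound-wait.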

Deadlock Avoidance
Aborting a transaction is not always a practical approach. Instead, deadlock avoidance mechanisms can be used to
detect any deadlock situation in advance. Methods like the "wait-for graph" are available, but they are suitable only
for systems where transactions are lightweight and hold few resource instances. In a bulky system, deadlock
prevention techniques may work well.

Wait-for Graph
This is a simple method available to track if any deadlock situation may arise. For each transaction entering into the
system, a node is created. When a transaction Ti requests for a lock on an item, say X, which is held by some other
transaction Tj, a directed edge is created from Ti to Tj. If Tj releases item X, the edge between them is dropped and
Ti locks the data item.

The system maintains this wait-for graph for every transaction waiting for some data items held by others. The system
keeps checking if there's any cycle in the graph.
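Checking the wait-for graph for a cycle is a standard depth-first search. The sketch below is one possible way to do it, not a description of any particular DBMS: the graph is a plain dictionary mapping each waiting transaction to the transactions it waits for, and the function name has_cycle is made up.

```python
def has_cycle(wait_for):
    """Return True if the wait-for graph contains a cycle (a deadlock)."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in wait_for.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                return True  # back edge: nxt is already on the path
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(
        colour.get(node, WHITE) == WHITE and visit(node)
        for node in list(wait_for)
    )


# T0 waits for T1, T1 waits for T2, T2 waits for T0: a deadlock.
deadlocked = has_cycle({"T0": ["T1"], "T1": ["T2"], "T2": ["T0"]})

# Once T0 releases its item, the edge T2 -> T0 is dropped.
resolved = has_cycle({"T0": ["T1"], "T1": ["T2"], "T2": []})
print(deadlocked, resolved)
```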

Here, we can use any of the two following approaches −

 First, do not allow any request for an item, which is already locked by another transaction. This is not always
feasible and may cause starvation, where a transaction indefinitely waits for a data item and can never acquire
it.

 The second option is to roll back one of the transactions. It is not always feasible to roll back the younger
transaction, as it may be more important than the older one. With the help of some selection algorithm, a transaction
is chosen to be aborted. This transaction is known as the victim and the process is known as victim
selection.

DBMS - Data Backup


Loss of Volatile Storage


A volatile storage like RAM stores all the active logs, disk buffers, and related data. In addition, it stores all the
transactions that are being currently executed. What happens if such a volatile storage crashes abruptly? It would
obviously take away all the logs and active copies of the database. It makes recovery almost impossible, as everything
that is required to recover the data is lost.

Following techniques may be adopted in case of loss of volatile storage −

 We can have checkpoints at multiple stages so as to save the contents of the database periodically.

 A state of active database in the volatile memory can be periodically dumped onto a stable storage, which may
also contain logs and active transactions and buffer blocks.

 <dump> can be marked on a log file whenever the database contents are dumped from volatile memory
to stable storage.

Recovery
 When the system recovers from a failure, it can restore the latest dump.

 It can maintain a redo-list and an undo-list as checkpoints.

 It can recover the system by consulting undo-redo lists to restore the state of all transactions up to the last
checkpoint.

Database Backup & Recovery from Catastrophic Failure


A catastrophic failure is one where a stable, secondary storage device becomes corrupted. With the storage device, all the
valuable data stored on it is lost. We have two different strategies to recover data from such a catastrophic
failure −

 Remote backup − Here a backup copy of the database is stored at a remote location from where it can be
restored in case of a catastrophe.

 Alternatively, database backups can be taken on magnetic tapes and stored at a safer place. This backup can
later be transferred onto a freshly installed database to bring it to the point of backup.

Large databases are too bulky to be backed up frequently. In such cases, there are techniques for restoring
a database from its logs, so all that is needed is to back up the logs at frequent
intervals. The database itself can be backed up once a week, and the logs, being very small, can be backed up every
day or as frequently as possible.

Remote Backup
Remote backup provides a sense of security in case the primary location where the database is located gets destroyed.
Remote backup can be offline or real-time or online. In case it is offline, it is maintained manually.


Online backup systems work in real time and are lifesavers for database administrators and investors. An online backup
system is a mechanism where every bit of the real-time data is backed up simultaneously at two distant places. One of
them is directly connected to the system and the other one is kept at a remote place as backup.

As soon as the primary database storage fails, the backup system senses the failure and switches the user system to the
remote storage. Sometimes this happens so quickly that the users do not even notice the failure.

DBMS - Data Recovery

Crash Recovery
DBMS is a highly complex system with hundreds of transactions being executed every second. The durability and
robustness of a DBMS depends on its complex architecture and its underlying hardware and system software. If it fails
or crashes amid transactions, it is expected that the system would follow some sort of algorithm or techniques to
recover lost data.

Failure Classification
To see where the problem has occurred, we generalize a failure into various categories, as follows −

Transaction failure
A transaction has to abort when it fails to execute or when it reaches a point from where it can’t go any further. This is
called transaction failure where only a few transactions or processes are hurt.

Reasons for a transaction failure could be −

 Logical errors − Where a transaction cannot complete because it has some code error or any internal error
condition.

 System errors − Where the database system itself terminates an active transaction because the DBMS is not
able to execute it, or it has to stop because of some system condition. For example, in case of deadlock or
resource unavailability, the system aborts an active transaction.

System Crash
There are problems − external to the system − that may cause the system to stop abruptly and crash. For example,
an interruption in the power supply may cause the underlying hardware or software to fail. Examples may also
include operating system errors.

Disk Failure
In early days of technology evolution, it was a common problem where hard-disk drives or storage drives used to fail
frequently.

Disk failures include formation of bad sectors, unreachability to the disk, disk head crash or any other failure, which
destroys all or a part of disk storage.

Storage Structure
We have already described the storage system. In brief, the storage structure can be divided into two categories −

 Volatile storage − As the name suggests, a volatile storage cannot survive system crashes. Volatile storage
devices are placed very close to the CPU; normally they are embedded onto the chipset itself. For example,
main memory and cache memory are examples of volatile storage. They are fast but can store only a small
amount of information.

 Non-volatile storage − These memories are made to survive system crashes. They are huge in data storage
capacity, but slower in accessibility. Examples may include hard-disks, magnetic tapes, flash memory, and
non-volatile (battery backed up) RAM.

Recovery and Atomicity


When a system crashes, it may have several transactions being executed and various files opened for them to modify
the data items. Transactions are made of various operations, which are atomic in nature. But according to ACID
properties of DBMS, atomicity of transactions as a whole must be maintained, that is, either all the operations are
executed or none.

When a DBMS recovers from a crash, it should maintain the following −

 It should check the states of all the transactions, which were being executed.

 A transaction may be in the middle of some operation; the DBMS must ensure the atomicity of the transaction
in this case.

 It should check whether the transaction can be completed now or it needs to be rolled back.

 No transactions would be allowed to leave the DBMS in an inconsistent state.

There are two types of techniques, which can help a DBMS in recovering as well as maintaining the atomicity of a
transaction −

 Maintaining the logs of each transaction, and writing them onto some stable storage before actually modifying
the database.

 Maintaining shadow paging, where the changes are done on a volatile memory, and later, the actual database is
updated.

Log-based Recovery
Log is a sequence of records, which maintains the records of actions performed by a transaction. It is important that
the logs are written prior to the actual modification and stored on a stable storage media, which is failsafe.

Log-based recovery works as follows −

 The log file is kept on a stable storage media.


 When a transaction enters the system and starts execution, it writes a log about it.

<Tn, Start>

 When the transaction modifies an item X, it writes a log record as follows −

<Tn, X, V1, V2>

This reads: Tn has changed the value of X from V1 to V2.

 When the transaction finishes, it logs −


<Tn, commit>

The database can be modified using two approaches −

 Deferred database modification − All logs are written on to the stable storage and the database is updated
when a transaction commits.

 Immediate database modification − Each log follows an actual database modification. That is, the database is
modified immediately after every operation.

Recovery with Concurrent Transactions


When more than one transaction is being executed in parallel, the logs are interleaved. At the time of recovery, it
would become hard for the recovery system to backtrack through all the logs and then start recovering. To ease this
situation, most modern DBMSs use the concept of 'checkpoints'.

Checkpoint
Keeping and maintaining logs in real time and in a real environment may fill up all the memory space available in the
system. As time passes, the log file may grow too big to be handled at all. A checkpoint is a mechanism where all the
previous logs are removed from the system and stored permanently on a storage disk. A checkpoint declares a point
before which the DBMS was in a consistent state and all the transactions were committed.

Recovery
When a system with concurrent transactions crashes and recovers, it behaves in the following manner:

• The recovery system reads the logs backwards from the end to the last checkpoint.

• It maintains two lists, an undo-list and a redo-list.

• If the recovery system sees a log with <Tn, Start> and <Tn, Commit>, or just <Tn, Commit>, it puts the transaction
in the redo-list.

• If the recovery system sees a log with <Tn, Start> but finds no commit or abort log, it puts the transaction in the
undo-list.

All the transactions in the undo-list are then undone and their logs are removed. For all the transactions in the
redo-list, their previous logs are removed and the transactions are redone before their logs are saved.
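
The classification step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple encoding of log records and the function name build_recovery_lists are hypothetical, and the list is taken to hold only the records written after the last checkpoint.

```python
def build_recovery_lists(log_after_checkpoint):
    """Scan the log backwards and sort transactions into undo and redo lists.

    Each record is a tuple (transaction, kind, ...) where kind is "start",
    "update", "commit" or "abort". This encoding is an assumption of the sketch.
    """
    committed, aborted, started = set(), set(), set()
    for record in reversed(log_after_checkpoint):
        txn, kind = record[0], record[1]
        if kind == "commit":
            committed.add(txn)
        elif kind == "abort":
            aborted.add(txn)
        elif kind == "start":
            started.add(txn)
    redo_list = sorted(committed)                      # a <Tn, Commit> was seen
    undo_list = sorted(started - committed - aborted)  # started but never finished
    return undo_list, redo_list


log = [
    ("T1", "start"),
    ("T1", "update", "X", 10, 20),  # <T1, X, 10, 20>
    ("T2", "start"),
    ("T0", "commit"),               # T0 started before the checkpoint
    ("T1", "commit"),
    ("T2", "update", "Y", 5, 7),    # the crash happens before T2 commits
]
undo_list, redo_list = build_recovery_lists(log)
print("undo:", undo_list)
print("redo:", redo_list)
```

Here T0 and T1 land in the redo-list because a commit record was found for each, while T2 lands in the undo-list because it started but never committed or aborted.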

Introduction to Database Keys


Keys are a very important part of the relational database model. They are used to establish and identify relationships
between tables, and also to uniquely identify any record or row of data inside a table.
A key can be a single attribute or a group of attributes, where the combination acts as a key.


Why do we need a Key?


In real-world applications, the number of tables required for storing the data is huge, and the different tables are
related to each other as well.
Tables also store a lot of data. A table generally extends to thousands of records, stored unsorted and
unorganized.
To fetch a particular record from such a dataset, you have to apply some conditions. But what if duplicate data
is present, and every time you try to fetch some data by applying a condition you get the wrong data? How many
trials would it take before you get the right data?
To avoid all this, keys are defined to easily identify any row of data in a table.
Let's try to understand all the keys using a simple example.

student_id name phone age

1 Abdullah 9876723452 17

2 Abdul 9991165674 19

3 Wasim 7898756543 18

4 Malik 8987867898 19

5 Bakar 9990080080 17

Let's take a simple Student table, with fields student_id, name, phone and age.

Super Key
A Super Key is defined as a set of attributes within a table that can uniquely identify each record within the table.
A super key is a superset of a candidate key.
In the table defined above, the super keys include student_id, (student_id, name), phone, etc.
Confused? The first one is pretty simple: student_id is unique for every row of data, hence it can be used to
identify each row uniquely.
Next comes (student_id, name). The names of two students can be the same, but their student_id can't be the same,
hence this combination can also be a key.
Similarly, the phone number for every student will be unique, hence phone can also be a key.
So they are all super keys.

Candidate Key
Candidate keys are defined as the minimal sets of fields which can uniquely identify each record in a table. A
candidate key is an attribute or a set of attributes that can act as a Primary Key for a table to uniquely identify
each record in that table. There can be more than one candidate key.

In our example, student_id and phone are both candidate keys for the table Student.

• A candidate key can never be NULL or empty, and its value should be unique.
• There can be more than one candidate key for a table.
• A candidate key can be a combination of more than one column (attribute).

Primary Key
Primary key is a candidate key that is most appropriate to become the main key for any table. It is a key that can
uniquely identify each record in a table.

For the table Student we can make the student_id column as the primary key.
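
The difference between a super key and a candidate key can be tried out on the sample Student rows with a short Python sketch. The function names is_super_key and minimal_keys are hypothetical helpers written for this illustration. Note that sample data can only show that a set of columns is not a key; whether a column really is a key is a design decision about all possible data.

```python
from itertools import combinations

# The five sample rows of the Student table from the example above.
students = [
    {"student_id": 1, "name": "Abdullah", "phone": "9876723452", "age": 17},
    {"student_id": 2, "name": "Abdul", "phone": "9991165674", "age": 19},
    {"student_id": 3, "name": "Wasim", "phone": "7898756543", "age": 18},
    {"student_id": 4, "name": "Malik", "phone": "8987867898", "age": 19},
    {"student_id": 5, "name": "Bakar", "phone": "9990080080", "age": 17},
]


def is_super_key(attributes, rows):
    """True if no two rows share the same values for the given attributes."""
    values = {tuple(row[a] for a in attributes) for row in rows}
    return len(values) == len(rows)


def minimal_keys(rows):
    """Return the minimal unique attribute sets (candidate keys of this sample)."""
    attributes = list(rows[0])
    found = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            # Skip supersets of a key already found: they are not minimal.
            if any(set(key) <= set(combo) for key in found):
                continue
            if is_super_key(combo, rows):
                found.append(combo)
    return found


keys = minimal_keys(students)
print(is_super_key(("student_id", "name"), students))
print(is_super_key(("age",), students))
print(keys)
```

In this sample, name also comes out as unique, but only because the five names happen to differ. Two students can share a name, which is why the text treats only student_id and phone as candidate keys.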

Composite Key
Key that consists of two or more attributes that uniquely identify any record in a table is called Composite key. But
the attributes which together form the Composite key are not a key independently or individually.

Consider a Score table which stores the marks scored by a student in a particular subject.
In this table student_id and subject_id together form the primary key, hence it is a composite key.

Secondary or Alternative key


The candidate keys which are not selected as the primary key are known as secondary keys or alternative keys.

Non-key Attributes
Non-key attributes are the attributes or fields of a table, other than candidate key attributes/fields in a table.

Non-prime Attributes
Non-prime attributes are attributes that are not part of any candidate key of the table.

Introduction to SQL
Structured Query Language (SQL) is a database query language used for storing and managing data in a Relational
DBMS. SQL was the first commercial language introduced for E.F. Codd's relational model of databases. Today
almost all RDBMSs (MySQL, Oracle, Informix, Sybase, MS Access) use SQL as the standard database query language.
SQL is used to perform all types of data operations in an RDBMS.

SQL Command
SQL defines the following ways to manipulate data stored in an RDBMS.
DDL: Data Definition Language
This includes changes to the structure of a table, such as creating, altering or deleting a table.
In many systems (for example Oracle and MySQL) DDL commands are auto-committed, which means that all their
changes are saved permanently in the database.

Command Description

create to create a new table or database

alter to alter the structure of a table

truncate to delete all data from a table

drop to drop a table

rename to rename a table

DML: Data Manipulation Language


DML commands are used for manipulating the data stored in a table, not the table itself.
DML commands are not auto-committed (unless the client runs in an autocommit mode). This means that the changes
are not permanent in the database until they are committed, and they can be rolled back.

Command Description

insert to insert a new row

update to update existing row

delete to delete a row

merge merging two rows or two tables


TCL: Transaction Control Language


These commands keep a check on other commands and their effect on the database. They can
annul changes made by other commands by rolling the data back to its original state. They can also make any
temporary change permanent.

Command Description

commit to permanently save

rollback to undo change

savepoint to save temporarily

DCL: Data Control Language


Data control language consists of the commands used to grant authority to, and take it back from, any database user.

Command Description

grant to grant a permission or right

revoke to take back a permission

DQL: Data Query Language


Data query language is used to fetch data from tables based on conditions that we can easily apply.

Command Description

select to retrieve records from one or more tables

SQL: create command


create is a DDL SQL command used to create a table or a database in relational database management system.

Creating a Database
To create a database in RDBMS, create command is used. Following is the syntax,
CREATE DATABASE <DB_NAME>;

Example for creating Database


CREATE DATABASE Test;
The above command will create a database named Test, which will be an empty schema without any table.
To create tables in this newly created database, we can again use the create command.
Creating a Table
create command can also be used to create tables. Now when we create a table, we have to specify the details of
the columns of the tables too. We can specify the names and datatypes of various columns in the create command
itself.


Following is the syntax,


CREATE TABLE <TABLE_NAME>
(
column_name1 datatype1,
column_name2 datatype2,
column_name3 datatype3,
column_name4 datatype4
);

create table command will tell the database system to create a new table with the given table name and column
information.

Example for creating Table


CREATE TABLE Student(
student_id INT,
name VARCHAR(100),
age INT);

The above command will create a new table named Student in the current database with 3 columns,
namely student_id, name and age. The column student_id will store only integers, name will hold up to 100
characters, and age will again store only integer values.
If the database in which you want to create the table is not your current database, you can prefix the table
name with the database name, using the dot operator (.).
For example, if we have a database named Test and we want to create a table Student in it, we can do so
using the following query:
CREATE TABLE Test.Student(
student_id INT,
name VARCHAR(100),
age INT);
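
To try the CREATE TABLE statement without installing a database server, the sketch below runs it against an in-memory SQLite database through Python's standard sqlite3 module. SQLite is an assumption of this sketch, not the system the notes are written for, and PRAGMA table_info is a SQLite-specific way of listing the columns.

```python
import sqlite3

# An in-memory database that disappears when the connection is closed.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Student(
        student_id INT,
        name VARCHAR(100),
        age INT
    )
""")

# PRAGMA table_info returns one row per column: (cid, name, type, ...).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Student)")]
print(columns)
conn.close()
```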

Most commonly used datatypes for Table columns


Here are some of the most commonly used datatypes for table columns.

Datatype Use

INT used for columns which will store integer values.

FLOAT used for columns which will store floating-point values.

DOUBLE used for columns which will store floating-point values, usually with higher precision than FLOAT.

VARCHAR used for columns which will store variable-length strings (letters, digits and other characters).

CHAR used for columns which will store fixed-length strings; without a declared length it holds a single character.

DATE used for columns which will store date values.

TEXT used for columns which will store text which is generally long. For example, if
you create a table for storing profile information of a social networking website, then
for the "about me" section you can have a column of type TEXT.


SQL: ALTER command


The alter command is used for altering the table structure, for example:

• to add a column to an existing table
• to rename any existing column
• to change the datatype of any column or to modify its size
• to drop a column from the table

ALTER Command: Add a new Column


Using the ALTER command we can add a column to any existing table. Following is the syntax,
ALTER TABLE table_name ADD(
column_name datatype);
Here is an example for this,
ALTER TABLE student ADD(
address VARCHAR(200)
);
The above command will add a new column address to the table student, which will hold data of type varchar, that
is, a string of up to 200 characters.
ALTER Command: Add multiple new Columns
Using the ALTER command we can even add multiple new columns to any existing table. Following is the syntax,
ALTER TABLE table_name ADD(
column_name1 datatype1,
column_name2 datatype2,
column_name3 datatype3);
Here is an example for this,
ALTER TABLE student ADD(
father_name VARCHAR(60),
mother_name VARCHAR(60),
dob DATE);
The above command will add three new columns to the student table.
ALTER Command: Add Column with default value
The ALTER command can also add a new column with a default value to an existing table. The default value is used
when no value is inserted in the column. Following is the syntax,
ALTER TABLE table_name ADD(
column_name1 datatype1 DEFAULT some_value
);
Here is an example for this,
ALTER TABLE student ADD(
dob DATE DEFAULT '01-Jan-99'
);
The above command will add a new column with a preset default value to the table student.
ALTER Command: Modify an existing Column
The ALTER command can also be used to modify the data type of any existing column. Following is the syntax,
ALTER TABLE table_name MODIFY(
column_name datatype
);
Here is an example for this,
ALTER TABLE student MODIFY(
address VARCHAR(300));
Remember we added a new column address in the beginning? The above command will modify the address column
of the student table to now hold up to 300 characters.
ALTER Command: Rename a Column
Using the ALTER command you can rename an existing column. Following is the syntax,
ALTER TABLE table_name RENAME COLUMN
old_column_name TO new_column_name;
Here is an example for this,
ALTER TABLE student RENAME COLUMN
address TO location;
The above command will rename the address column to location.

ALTER Command: Drop a Column


The ALTER command can also be used to drop or remove columns. Following is the syntax,
ALTER TABLE table_name DROP(
column_name);
Here is an example for this,
ALTER TABLE student DROP(
address);
The above command will drop the address column from the table student.
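
The effect of adding columns can be checked with a small runnable sketch. It assumes SQLite through Python's sqlite3 module, where the statement is spelled ALTER TABLE ... ADD COLUMN and takes one column at a time, unlike the parenthesised ADD( ... ) form shown above. MODIFY is not available in SQLite, so only the two ADD cases are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student(student_id INT, name VARCHAR(100), age INT)")
conn.execute("INSERT INTO student VALUES(1, 'Abdullah', 17)")

# Add a plain column, then a column with a default value (SQLite spelling).
conn.execute("ALTER TABLE student ADD COLUMN address VARCHAR(200)")
conn.execute("ALTER TABLE student ADD COLUMN dob DATE DEFAULT '01-Jan-99'")

column_names = [row[1] for row in conn.execute("PRAGMA table_info(student)")]
# The row inserted before the ALTER gets NULL for address and the default for dob.
existing_row = conn.execute("SELECT address, dob FROM student").fetchone()
print(column_names)
print(existing_row)
conn.close()
```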
Truncate, Drop or Rename a Table
In this section we will learn about the DDL commands which are used to re-define tables.
TRUNCATE command
The TRUNCATE command removes all the records from a table, but it does not destroy the table's structure.
When we use the TRUNCATE command on a table, its (auto-increment) primary key is also re-initialized. Following is
its syntax,
TRUNCATE TABLE table_name;
Here is an example explaining it,
TRUNCATE TABLE student;

The above query will delete all the records from the table student.
Among the DML commands we will study the DELETE command, which is more or less the same as
the TRUNCATE command. The difference between the two is explained in that section.

DROP command
DROP command completely removes a table from the database. This command will also destroy the table structure
and the data stored in it. Following is its syntax,
DROP TABLE table_name;
Here is an example explaining it,
DROP TABLE student;
The above query will delete the Student table completely. It can also be used on Databases, to delete the complete
database. For example, to drop a database,
DROP DATABASE Test;
The above query will drop the database with name Test from the system.

RENAME query
RENAME command is used to set a new name for any existing table. Following is the syntax,

RENAME TABLE old_table_name TO new_table_name;

Here is an example explaining it.
RENAME TABLE student TO students_info;

The above query will rename the table student to students_info.


Using INSERT SQL command
Data Manipulation Language (DML) statements are used for managing data in database. DML commands are not
auto-committed. It means changes made by DML command are not permanent to database, it can be rolled back.
Talking about the Insert command, whenever we post a Tweet on Twitter, the text is stored in some table, and as
we post a new tweet, a new record gets inserted in that table.

INSERT command
The INSERT command is used to insert data into a table. Following is its general syntax,
INSERT INTO table_name VALUES(data1, data2, ...);
Let's see an example.
Consider a table student with the following fields.

s_id name age

INSERT INTO student VALUES(101, 'Adam', 15);

The above command will insert a new record into the student table.

s_id name age

101 Adam 15

Insert value into only specific columns


We can use the INSERT command to insert values for only some specific columns of a row. We can specify the
column names along with the values to be inserted like this,
INSERT INTO student(s_id, name) VALUES(102, 'Abdullah');
The above SQL query will insert only the s_id and name values in the newly inserted record.

Insert NULL value to a column


Both the statements below will insert a NULL value into the age column of the student table (assuming age has no
default value).
INSERT INTO student(s_id, name) VALUES(102, 'Abdullah');
Or,
INSERT INTO student VALUES(102, 'Abdullah', NULL);
The first statement supplies only two column values and leaves age unset; the second sets age to NULL explicitly.

s_id name age

101 Adam 15

102 Abdullah NULL

Insert Default value to a column


Suppose the column age in our table has a default value of 14.
INSERT INTO student VALUES(103, 'Abdul', DEFAULT);

s_id name age

101 Adam 15

102 Abdullah NULL

103 Abdul 14

Also, a query that leaves the age column out of its column list, like the one below, inserts the default value
into the age column, whatever the default value may be.
INSERT INTO student(s_id, name) VALUES(103, 'Abdul');
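
The three INSERT variants can be compared side by side in a runnable sketch. It assumes SQLite through Python's sqlite3 module and a student table whose age column is declared with DEFAULT 14. SQLite does not accept the DEFAULT keyword inside VALUES, so the default is triggered here by leaving age out of the column list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student(s_id INT, name VARCHAR(100), age INT DEFAULT 14)")

# 1. A value for every column.
conn.execute("INSERT INTO student VALUES(101, 'Adam', 15)")
# 2. An explicit NULL overrides the default.
conn.execute("INSERT INTO student VALUES(102, 'Abdullah', NULL)")
# 3. age is left out of the column list, so the default value is used.
conn.execute("INSERT INTO student(s_id, name) VALUES(103, 'Abdul')")

rows = conn.execute("SELECT s_id, name, age FROM student ORDER BY s_id").fetchall()
for row in rows:
    print(row)
conn.close()
```

Python shows SQL NULL as None, so the second row prints with age None while the third row receives the default 14.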

Using UPDATE SQL command


Let's take an example of a real-world problem. These days, Facebook provides an option for editing your status
update. How do you think it works? Yes, using the UPDATE SQL command.
Let's learn about the syntax and usage of the UPDATE command.


UPDATE command

UPDATE command is used to update any record of data in a table. Following is its general syntax,

UPDATE table_name SET column_name = new_value WHERE some_condition;


WHERE is used to add a condition to any SQL query, we will soon study about it in detail.
Let's take a sample table student,

s_id name age

101 Adam 15

102 Alex NULL

103 Chris 14

UPDATE student SET age=18 WHERE s_id=102;

s_id name age

101 Adam 15

102 Alex 18

103 Chris 14

In the above statement, if we do not use the WHERE clause, then our update query will update age for all the
rows of the table to 18.

Updating Multiple Columns


We can also update values of multiple columns using a single UPDATE statement.
UPDATE student SET name='Abdul', age=17 WHERE s_id=103;
The above command will update two columns of the record which has s_id 103.

s_id name age

101 Adam 15

102 Alex 18

103 Abdul 17

UPDATE Command: Incrementing Integer Value


When we have to update a numeric value in a table, we can fetch and update the value in a
single statement.
For example, if we have to update the age column of the student table every year for every student, then we can
simply run the following UPDATE statement:
UPDATE student SET age = age+1;
As you can see, we have used age = age + 1 to increment the value of age by 1.
NOTE: This style only works for numeric values.
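
The UPDATE statements from this section can be run in sequence with the sketch below. It assumes SQLite through Python's sqlite3 module and a sample student table with columns s_id, name and age; the starting rows are sample data for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student(s_id INT, name VARCHAR(100), age INT)")
conn.executemany(
    "INSERT INTO student VALUES(?, ?, ?)",
    [(101, "Adam", 15), (102, "Alex", None), (103, "Chris", 14)],
)

# Update one column of one row.
conn.execute("UPDATE student SET age = 18 WHERE s_id = 102")
# Update two columns of one row in a single statement.
conn.execute("UPDATE student SET name = 'Abdul', age = 17 WHERE s_id = 103")
# No WHERE clause: every row is updated, each age is incremented by 1.
conn.execute("UPDATE student SET age = age + 1")

rows = conn.execute("SELECT s_id, name, age FROM student ORDER BY s_id").fetchall()
for row in rows:
    print(row)
conn.close()
```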

Using DELETE SQL command


When you ask a question in Studytonight's Forum, it gets saved into a table. Using the Delete option, you can
even delete a question asked by you. How do you think that works? Yes, using the DELETE DML command.
Let's study the syntax and the usage of the DELETE command.

DELETE command

DELETE command is used to delete data from a table.

Following is its general syntax,


DELETE FROM table_name;

Let's take a sample table student:

s_id name age

101 Adam 15

102 Alex 18

103 Abdul 17

Delete all Records from a Table


DELETE FROM student;
The above command will delete all the records from the table student.

Delete a particular Record from a Table


In our student table if we want to delete a single record, we can use the WHERE clause to provide a condition in
our DELETE statement.
DELETE FROM student WHERE s_id=103;
The above command will delete the record where s_id is 103 from the table student.

s_id name age

101 Adam 15

102 Alex 18

Isn't DELETE the same as TRUNCATE?


The TRUNCATE command is different from the DELETE command. The DELETE command deletes all the rows from a table,
whereas the TRUNCATE command not only deletes all the records stored in the table but also re-initializes the
table (like a newly created table).
For example, if you have a table with 10 rows and an auto_increment primary key and you use the DELETE command to
delete all the rows, it will delete all the rows but will not re-initialize the primary key. Hence if you insert a
row after using the DELETE command, the auto_increment primary key will start from 11. In the case of
the TRUNCATE command, the primary key is re-initialized and will again start from 1.
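
The DELETE half of this comparison can be checked with a small sketch. It assumes SQLite through Python's sqlite3 module. SQLite has no TRUNCATE command, so only the DELETE behaviour is shown, and AUTOINCREMENT is SQLite's spelling of an auto-increment primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student(s_id INTEGER PRIMARY KEY AUTOINCREMENT, name VARCHAR(100))"
)
conn.executemany(
    "INSERT INTO student(name) VALUES(?)", [("Adam",), ("Alex",), ("Abdul",)]
)

# Delete a single record with a WHERE clause.
conn.execute("DELETE FROM student WHERE s_id = 3")
remaining = [row[0] for row in conn.execute("SELECT name FROM student ORDER BY s_id")]

# Delete every record, then insert a new one.
conn.execute("DELETE FROM student")
conn.execute("INSERT INTO student(name) VALUES('Malik')")
new_id = conn.execute("SELECT s_id FROM student").fetchone()[0]

print(remaining)
print(new_id)  # the counter continues after the highest id ever used
conn.close()
```

Three ids were handed out before the table was emptied, so the new row continues the sequence instead of starting again from 1.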

Commit, Rollback and Savepoint SQL commands


Transaction Control Language(TCL) commands are used to manage transactions in the database. These are used to
manage the changes made to the data in a table by DML statements. It also allows statements to be grouped
together into logical transactions.

COMMIT command
COMMIT command is used to permanently save any transaction into the database.

When we use any DML command like INSERT, UPDATE or DELETE, the changes made by these commands are not
permanent until they are committed; until then, they can be rolled back.
To make the changes permanent, we use the COMMIT command.
Following is commit command's syntax,
COMMIT;

ROLLBACK command
This command restores the database to the last committed state. It is also used with the SAVEPOINT command to jump
to a savepoint in an ongoing transaction.
If we have used the UPDATE command to make some changes to the database and realise that those changes
were not required, then we can use the ROLLBACK command to roll back those changes, provided they have not been
committed using the COMMIT command.
Following is the rollback command's syntax,
ROLLBACK;
To roll back only to a savepoint, use
ROLLBACK TO savepoint_name;

SAVEPOINT command


SAVEPOINT command is used to temporarily save a transaction so that you can rollback to that point whenever
required.
Following is savepoint command's syntax,
SAVEPOINT savepoint_name;
In short, using this command we can name the different states of our data in any table and then rollback to that
state using the ROLLBACK command whenever required.

Using Savepoint and Rollback


Following is the table class,

id name

1 Abdul

2 Adam

4 Alex

Let’s use some SQL queries on the above table and see the results.
INSERT INTO class VALUES(5, 'Abdul');

COMMIT;

UPDATE class SET name = 'Abdulit' WHERE id = '5';

SAVEPOINT A;

INSERT INTO class VALUES(6, 'Chris');

SAVEPOINT B;

INSERT INTO class VALUES(7, 'Bravo');

SAVEPOINT C;

SELECT * FROM class;


NOTE: SELECT statement is used to show the data stored in the table.
The resultant table will look like,

id name

1 Abdul

2 Adam

4 Alex

5 Abdulit

6 Chris

7 Bravo

Now let's use the ROLLBACK command to roll back the state of data to the savepoint B.
ROLLBACK TO B;

SELECT * FROM class;


Now our class table will look like,

id name

1 Abdul

2 Adam

4 Alex

5 Abdulit

6 Chris

Now let's again use the ROLLBACK command to roll back the state of data to the savepoint A
ROLLBACK TO A;

SELECT * FROM class;


Now the table will look like,

id name

1 Abdul

2 Adam

4 Alex

5 Abdulit

So now you know how the commands COMMIT, ROLLBACK and SAVEPOINT work.
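
The whole sequence can be replayed in a runnable sketch. It assumes SQLite through Python's sqlite3 module, opened with isolation_level=None so that BEGIN, COMMIT, SAVEPOINT and ROLLBACK TO are issued exactly as written; the helper current_ids is a hypothetical name used only here.

```python
import sqlite3

# isolation_level=None turns off the module's implicit transaction handling.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE class(id INT, name VARCHAR(100))")
conn.executemany(
    "INSERT INTO class VALUES(?, ?)", [(1, "Abdul"), (2, "Adam"), (4, "Alex")]
)


def current_ids():
    """Return the ids currently visible in the class table."""
    return [row[0] for row in conn.execute("SELECT id FROM class ORDER BY id")]


conn.execute("BEGIN")
conn.execute("INSERT INTO class VALUES(5, 'Abdul')")
conn.execute("COMMIT")

conn.execute("BEGIN")
conn.execute("UPDATE class SET name = 'Abdulit' WHERE id = 5")
conn.execute("SAVEPOINT A")
conn.execute("INSERT INTO class VALUES(6, 'Chris')")
conn.execute("SAVEPOINT B")
conn.execute("INSERT INTO class VALUES(7, 'Bravo')")
conn.execute("SAVEPOINT C")
all_ids = current_ids()

conn.execute("ROLLBACK TO B")  # undoes the insert of id 7
after_b = current_ids()
conn.execute("ROLLBACK TO A")  # also undoes the insert of id 6
after_a = current_ids()
name_5 = conn.execute("SELECT name FROM class WHERE id = 5").fetchone()[0]

print(all_ids, after_b, after_a, name_5)
conn.close()
```

The update of row 5 survives both rollbacks because it was made before savepoint A was set.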

Using GRANT and REVOKE


Data Control Language (DCL) is used to control privileges in a database. To perform any operation in the database,
such as creating tables, sequences or views, a user needs privileges. Privileges are of two types,

• System: This includes permissions for creating a session, a table, etc., and all other types of system privileges.
• Object: This includes permissions for any command or query to perform any operation on the database tables.

In DCL we have two commands,

• GRANT: Used to give a user access privileges or other privileges for the database.

• REVOKE: Used to take back permissions from any user.

Allow a User to create session


When we create a user (the commands in this section follow Oracle's syntax), the user is not even allowed to log in
and create a session until the proper permissions/privileges are granted.
The following command can be used to grant the session-creating privilege.
GRANT CREATE SESSION TO username;

Allow a User to create table


To allow a user to create tables in the database, we can use the below command,
GRANT CREATE TABLE TO username;

Provide user with space on tablespace to store table


Allowing a user to create a table is not enough to start storing data in that table. We must also provide the user
with privileges to use the available tablespace for their table and data.
ALTER USER username QUOTA UNLIMITED ON SYSTEM;
The above command will alter the user details and will provide it access to unlimited tablespace on system.
NOTE: Generally unlimited quota is provided to Admin users.

Grant all privilege to a User


sysdba is a set of privileges which has all the permissions in it. So if we want to provide all the privileges to any
user, we can simply grant them the sysdba permission.
GRANT sysdba TO username;

Grant permission to create any table


Sometimes a user is restricted from creating some tables with names which are reserved for system tables. But we
can grant privileges to a user to create any table using the below command,
GRANT CREATE ANY TABLE TO username;


Grant permission to drop any table


As the title suggests, if you want to allow a user to drop any table from the database, grant this privilege to the
user,
GRANT DROP ANY TABLE TO username;

To take back Permissions


And if you want to take back privileges from any user, use the REVOKE command.
REVOKE CREATE TABLE FROM username;

Using the WHERE SQL clause


The WHERE clause is used to specify a condition while retrieving, updating or deleting data from a table. It
is used mostly with SELECT, UPDATE and DELETE queries.
When we specify a condition using the WHERE clause, the query executes only for those records for which the
condition is true.

Syntax for WHERE clause


Here is how you can use the WHERE clause with a DELETE statement, or any other statement,
DELETE FROM table_name WHERE [condition];
The WHERE clause comes after the table name (and after the SET clause in an UPDATE) and specifies the condition for
execution; clauses such as GROUP BY and ORDER BY, if present, follow it.

Time for an Example


Consider a table student,

s_id name age address

101 Adam 15 England

102 Alex 18 New York

103 Abdul 17 Jauharabad

104 Abdullah 22 Pakistan

Now we will use the SELECT statement to display data of the table, based on a condition, which we will add to
our SELECT query using WHERE clause.


Let's write a simple SQL query to display the record for student with s_id as 101.
SELECT s_id,
name,
age,
address
FROM student WHERE s_id = 101;
Following will be the result of the above query.

s_id name age address

101 Adam 15 England

Applying condition on Text Fields


In the above example we applied a condition to an integer field, but what if we want to apply the
condition to the name field? In that case we must enclose the value in single quotes ' '. Some databases also accept
double quotes, but single quotes are accepted by all.
SELECT s_id,
name,
age,
address
FROM student WHERE name = 'Abdul';
Following will be the result of the above query.

s_id name age address

103 Abdul 17 Jauharabad

Operators for WHERE clause condition


Following is a list of operators that can be used while specifying the WHERE clause condition.

Operator Description

= Equal to

!= Not equal to

< Less than

> Greater than

<= Less than or equal to

>= Greater than or equal to

BETWEEN Between a specified range of values

LIKE Used to search for a pattern in a value

IN In a given set of values
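
A few of these operators can be tried on the four-row student table from the example. The sketch assumes SQLite through Python's sqlite3 module; matching_ids is a hypothetical helper that only accepts the fixed condition strings written in this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student(s_id INT, name VARCHAR(100), age INT, address VARCHAR(100))"
)
conn.executemany(
    "INSERT INTO student VALUES(?, ?, ?, ?)",
    [
        (101, "Adam", 15, "England"),
        (102, "Alex", 18, "New York"),
        (103, "Abdul", 17, "Jauharabad"),
        (104, "Abdullah", 22, "Pakistan"),
    ],
)


def matching_ids(condition):
    """Return the s_id of every row that satisfies the WHERE condition."""
    # The condition is a fixed string from this example, never user input.
    query = "SELECT s_id FROM student WHERE " + condition + " ORDER BY s_id"
    return [row[0] for row in conn.execute(query)]


equal = matching_ids("s_id = 101")
not_equal = matching_ids("name != 'Abdul'")
between = matching_ids("age BETWEEN 17 AND 20")  # both ends are included
in_set = matching_ids("address IN ('England', 'Pakistan')")
print(equal, not_equal, between, in_set)
conn.close()
```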

SQL LIKE clause


The LIKE clause is used in the condition of the WHERE clause in a SQL query. It compares data with an
expression that uses wildcard operators to match the pattern given in the condition.

Wildcard operators

There are two wildcard operators that are used in the LIKE clause.

• Percent sign %: represents zero, one or more than one character.
• Underscore sign _: represents exactly one character.

Example of LIKE clause


Consider the following Student table.

s_id s_Name age

101 Adam 15

102 Alex 18

103 Abdul 17
SELECT * from Student where s_name like 'A%';

The above query will return all records where s_name starts with character 'A'.

s_id s_Name age

101 Adam 15

102 Alex 18

103 Abdul 17

Example
SELECT * from Student where s_name like '_d%';

The above query will return all records from the Student table where s_name contains 'd' as the second character.

s_id s_Name age

101 Adam 15

Example
SELECT * from Student where s_name like '%x';

The above query will return all records from the Student table where s_name ends with the character 'x'.

s_id s_Name age

102 Alex 18
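
The three patterns can be run against the sample Student table with the sketch below. It assumes SQLite through Python's sqlite3 module, where LIKE ignores letter case for ASCII characters by default; the helper names_like is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student(s_id INT, s_name VARCHAR(100), age INT)")
conn.executemany(
    "INSERT INTO Student VALUES(?, ?, ?)",
    [(101, "Adam", 15), (102, "Alex", 18), (103, "Abdul", 17)],
)


def names_like(pattern):
    """Return the names matching a LIKE pattern, in s_id order."""
    cursor = conn.execute(
        "SELECT s_name FROM Student WHERE s_name LIKE ? ORDER BY s_id", (pattern,)
    )
    return [row[0] for row in cursor]


starts_with_a = names_like("A%")  # any name starting with 'A'
second_is_d = names_like("_d%")   # exactly one character, then 'd'
ends_with_x = names_like("%x")    # any name ending with 'x'
print(starts_with_a, second_is_d, ends_with_x)
conn.close()
```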

Order By Clause
The Order by clause is used with the Select statement to arrange the retrieved data in sorted order. By default the
Order by clause sorts data in ascending order. To sort data in descending order, the DESC keyword is used with it.

Syntax of Order By
SELECT column-list|* FROM table-name ORDER BY column-name ASC|DESC;

Example using Order by


Consider the following Emp table,

eid name age salary

401 Abdul 22 9000

402 Shane 29 8000

403 Rohan 34 6000

404 Scott 44 10000

405 Tiger 35 8000


SELECT * from Emp order by salary;

The above query will return result in ascending order of the salary.

eid name age salary

403 Rohan 34 6000

402 Shane 29 8000

405 Tiger 35 8000

401 Abdul 22 9000

404 Scott 44 10000

Example of Order by DESC


Consider the Emp table described above,
SELECT * from Emp order by salary DESC;

The above query will return result in descending order of the salary.

eid name age salary

404 Scott 44 10000

401 Abdul 22 9000

405 Tiger 35 8000

402 Shane 29 8000

403 Rohan 34 6000
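
Both orderings can be reproduced with the sketch below, which assumes SQLite through Python's sqlite3 module. Two employees share the salary 8000, and SQL does not define the order of tied rows, so this sketch adds eid as a second sort column (an addition made here, not part of the queries above) to get a repeatable result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp(eid INT, name VARCHAR(100), age INT, salary INT)")
conn.executemany(
    "INSERT INTO Emp VALUES(?, ?, ?, ?)",
    [
        (401, "Abdul", 22, 9000),
        (402, "Shane", 29, 8000),
        (403, "Rohan", 34, 6000),
        (404, "Scott", 44, 10000),
        (405, "Tiger", 35, 8000),
    ],
)

# Ascending is the default; eid breaks the tie between the two 8000 salaries.
ascending = [r[0] for r in conn.execute("SELECT eid FROM Emp ORDER BY salary, eid")]
descending = [
    r[0] for r in conn.execute("SELECT eid FROM Emp ORDER BY salary DESC, eid")
]
print(ascending)
print(descending)
conn.close()
```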

Group By Clause
The Group by clause is used to group the results of a SELECT query based on one or more columns. It is typically used
with SQL aggregate functions, which then compute one result per group.

Syntax for using Group by in a statement.


SELECT column_name, function(column_name)
FROM table_name
WHERE condition
GROUP BY column_name

Example of Group by in a Statement


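Consider the following Emp table.

eid name age salary

401 Abdul 22 9000

402 Shane 29 8000

403 Rohan 34 6000

404 Scott 44 9000

405 Tiger 35 8000


Here we want to find the name and age of employees grouped by their salaries.
The SQL query for the above requirement will be,
SELECT name, age

from Emp group by salary

Standard SQL requires every selected column to appear in the GROUP BY clause or inside an aggregate function, so
this query is accepted only by permissive systems (for example MySQL with ONLY_FULL_GROUP_BY disabled), which
return one arbitrary row per salary group. The result will be, for example,

name age

Rohan 34

Shane 29

Abdul 22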

Example of Group by in a Statement with WHERE clause


Consider the following Emp table

eid name age salary

401 Anu 22 9000

402 Shane 29 8000

403 Rohan 34 6000

404 Scott 44 9000

405 Tiger 35 8000

SQL query will be,


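select name, salary

from Emp

where age > 25

group by salary

Selecting name without aggregating it or listing it in the Group By clause is not standard SQL and is accepted only
by permissive systems, which return one arbitrary name per group. The result will be, for example,

name salary

Rohan 6000

Shane 8000

Scott 9000

Remember that the Group By clause comes after the WHERE clause and before any HAVING or Order by clause.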
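
A grouping query that every system accepts pairs GROUP BY with an aggregate function. The sketch below counts the employees in each salary group, with and without a WHERE filter. It assumes SQLite through Python's sqlite3 module, and COUNT(*) is chosen here only as an illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp(eid INT, name VARCHAR(100), age INT, salary INT)")
conn.executemany(
    "INSERT INTO Emp VALUES(?, ?, ?, ?)",
    [
        (401, "Abdul", 22, 9000),
        (402, "Shane", 29, 8000),
        (403, "Rohan", 34, 6000),
        (404, "Scott", 44, 9000),
        (405, "Tiger", 35, 8000),
    ],
)

# One output row per distinct salary, with the number of employees in it.
per_salary = conn.execute(
    "SELECT salary, COUNT(*) FROM Emp GROUP BY salary ORDER BY salary"
).fetchall()

# WHERE filters the rows first; grouping is applied to what remains.
per_salary_over_25 = conn.execute(
    "SELECT salary, COUNT(*) FROM Emp WHERE age > 25 "
    "GROUP BY salary ORDER BY salary"
).fetchall()

print(per_salary)
print(per_salary_over_25)
conn.close()
```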

HAVING Clause
The HAVING clause is used with SQL queries to give a more precise condition for a statement. It places a condition
on the result of a group-based (aggregate) SQL function, just as the WHERE clause places a condition on individual
rows.
Syntax for HAVING will be,
SELECT column_name, function(column_name)

FROM table_name

WHERE column_name condition

GROUP BY column_name

HAVING function(column_name) condition

Example of HAVING Statement


Consider the following Sale table.

oid order_name previous_balance customer

11 ord1 2000 Alex

12 ord2 1000 Adam

13 ord3 2000 Abdul

14 ord4 1000 Adam

15 ord5 2000 Alex


Suppose we want to find the customer whose previous_balance sum is more than 3000.
We will use the below SQL query,
SELECT customer, sum(previous_balance)

from sale group by customer

having sum(previous_balance) > 3000

Result will be,

customer sum(previous_balance)

Alex 4000
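
The customer totals behind this example can be checked with a runnable sketch. It assumes SQLite through Python's sqlite3 module and selects the grouped column together with the aggregate, which is the form accepted by every system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale(oid INT, order_name VARCHAR(20), "
    "previous_balance INT, customer VARCHAR(100))"
)
conn.executemany(
    "INSERT INTO sale VALUES(?, ?, ?, ?)",
    [
        (11, "ord1", 2000, "Alex"),
        (12, "ord2", 1000, "Adam"),
        (13, "ord3", 2000, "Abdul"),
        (14, "ord4", 1000, "Adam"),
        (15, "ord5", 2000, "Alex"),
    ],
)

# Total per customer, before any HAVING filter is applied.
totals = conn.execute(
    "SELECT customer, SUM(previous_balance) FROM sale "
    "GROUP BY customer ORDER BY customer"
).fetchall()

# HAVING keeps only the groups whose total exceeds 3000.
big_customers = conn.execute(
    "SELECT customer, SUM(previous_balance) FROM sale "
    "GROUP BY customer HAVING SUM(previous_balance) > 3000"
).fetchall()

print(totals)
print(big_customers)
conn.close()
```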

Distinct keyword
The distinct keyword is used with the Select statement to retrieve unique values from a table. Distinct removes all
the duplicate records while retrieving from the database.

Syntax for DISTINCT Keyword


SELECT DISTINCT column-name FROM table-name;

Example
Consider the following Emp table.

eid name age salary

401 Abdul 22 5000

402 Shane 29 8000

403 Rohan 34 10000

404 Scott 44 10000

405 Tiger 35 8000


select distinct salary from Emp;

The above query will return only the unique salary from Emp table

salary

5000

8000

10000
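
The same query can be run in a short sketch, assuming SQLite through Python's sqlite3 module. An ORDER BY clause is added here only so that the printed order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp(eid INT, name VARCHAR(100), age INT, salary INT)")
conn.executemany(
    "INSERT INTO Emp VALUES(?, ?, ?, ?)",
    [
        (401, "Abdul", 22, 5000),
        (402, "Shane", 29, 8000),
        (403, "Rohan", 34, 10000),
        (404, "Scott", 44, 10000),
        (405, "Tiger", 35, 8000),
    ],
)

all_salaries = [r[0] for r in conn.execute("SELECT salary FROM Emp ORDER BY salary")]
unique_salaries = [
    r[0] for r in conn.execute("SELECT DISTINCT salary FROM Emp ORDER BY salary")
]
print(all_salaries)     # five values, with duplicates
print(unique_salaries)  # each salary appears once
conn.close()
```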

AND & OR operator


AND and OR operators are used with Where clause to make more precise conditions for fetching data from
database by combining more than one condition together.

AND operator
AND operator is used to set multiple conditions with Where clause.

Example of AND
Consider the following Emp table

eid name age salary

401 Abdul 22 5000

402 Shane 29 8000

403 Rohan 34 12000

404 Scott 44 10000

405 Tiger 35 9000


SELECT * from Emp WHERE salary < 10000 AND age > 25

The above query will return records where salary is less than 10000 and age greater than 25.

eid name age salary

402 Shane 29 8000

405 Tiger 35 9000

OR operator
The OR operator is also used to combine multiple conditions with the Where clause. The only difference between AND
and OR is their behaviour. When we use AND to combine two or more conditions, only records satisfying all the
conditions will be in the result. In the case of OR, a record must satisfy at least one of the specified conditions
to be in the result.

Example of OR

Consider the following Emp table

eid name age salary

401 Abdul 22 5000

402 Shane 29 8000

403 Rohan 34 12000

404 Scott 44 10000

405 Tiger 35 9000


SELECT * from Emp WHERE salary > 10000 OR age > 25

The above query will return records where either salary is greater than 10000 or age is greater than 25.

eid name age salary

402 Shane 29 8000

403 Rohan 34 12000

404 Scott 44 10000

405 Tiger 35 9000
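The two operators can be compared side by side on the same Emp table. This is a sketch using Python's built-in sqlite3 module (an assumption made only to make the queries runnable); it returns just the eid of each matching row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (eid INTEGER, name TEXT, age INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO Emp VALUES (?, ?, ?, ?)",
    [
        (401, "Abdul", 22, 5000),
        (402, "Shane", 29, 8000),
        (403, "Rohan", 34, 12000),
        (404, "Scott", 44, 10000),
        (405, "Tiger", 35, 9000),
    ],
)

# AND: a row must satisfy both conditions.
and_ids = [row[0] for row in conn.execute(
    "SELECT eid FROM Emp WHERE salary < 10000 AND age > 25 ORDER BY eid")]

# OR: a row needs to satisfy only one of the conditions.
or_ids = [row[0] for row in conn.execute(
    "SELECT eid FROM Emp WHERE salary > 10000 OR age > 25 ORDER BY eid")]

print(and_ids)  # [402, 405]
print(or_ids)   # [402, 403, 404, 405]
```

Employee 401 appears in neither result: with age 22 the row fails the age condition, and its salary of 5000 does not satisfy salary > 10000.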

SQL Constraints
SQL Constraints are rules used to limit the type of data that can go into a table, to maintain the accuracy and
integrity of the data inside table.
Constraints can be divided into the following two types,

• Column level constraints : limit only the data of a single column

• Table level constraints : limit the data of the whole table

Constraints are used to make sure that the integrity of data is maintained in the database. Following are the most
used constraints that can be applied to a table.

• NOT NULL
• UNIQUE
• PRIMARY KEY
• FOREIGN KEY
• CHECK
• DEFAULT

NOT NULL Constraint


NOT NULL constraint restricts a column from having a NULL value. Once NOT NULL constraint is applied to a
column, you cannot pass a null value to that column. It enforces a column to contain a proper value. One important
point to note about NOT NULL constraint is that it cannot be defined at table level.

Example using NOT NULL constraint


CREATE table Student(s_id int NOT NULL, Name varchar(60), Age int);

The above query will declare that the s_id field of Student table will not take NULL value.

UNIQUE Constraint
UNIQUE constraint ensures that a field or column will only have unique values. A UNIQUE constraint field will not
have duplicate data. UNIQUE constraint can be applied at column level or table level.

Example using UNIQUE constraint when creating a Table (Column Level)


CREATE table Student(s_id int NOT NULL UNIQUE, Name varchar(60), Age int);

The above query will declare that the s_id field of Student table will only have unique values and won't take NULL
value.

Example using UNIQUE constraint after Table is created (Table Level)


ALTER table Student add UNIQUE(s_id);

The above query specifies that s_id field of Student table will only have unique values.

Primary Key Constraint


Primary key constraint uniquely identifies each record in a table. A Primary Key must contain unique values and
it must not contain null values. Usually the Primary Key is used to index the data inside the table.

Example using PRIMARY KEY constraint at Column Level


CREATE table Student (s_id int PRIMARY KEY, Name varchar(60) NOT NULL, Age int);

The above command creates a PRIMARY KEY on the s_id column.

Example using PRIMARY KEY constraint at Table Level


ALTER table Student add PRIMARY KEY (s_id);

The above command creates a PRIMARY KEY on the s_id column of an existing table.

Foreign Key Constraint


FOREIGN KEY is used to relate two tables. FOREIGN KEY constraint is also used to restrict actions that would
destroy links between tables. To understand FOREIGN KEY, let's see it using two tables.
Customer_Detail table:

c_id Customer_Name address

101 Adam Noida

102 Alex Delhi

103 Stuart Rohtak


Order_Detail table:

Order_id Order_Name c_id

10 Order1 101

11 Order2 103

12 Order3 102
In Customer_Detail table, c_id is the primary key, which is set as a foreign key in Order_Detail table. A value
entered in the c_id column of Order_Detail must be present in the c_id column of Customer_Detail, where it
is the primary key. This prevents invalid data from being inserted into the c_id column of Order_Detail table.

Example using FOREIGN KEY constraint at Column Level


CREATE table Order_Detail(order_id int PRIMARY KEY,
order_name varchar(60) NOT NULL,
c_id int REFERENCES Customer_Detail(c_id));
In this query, c_id in table Order_Detail is made a foreign key, which references the c_id column of
Customer_Detail.

Example using FOREIGN KEY constraint at Table Level


ALTER table Order_Detail add FOREIGN KEY (c_id) REFERENCES Customer_Detail(c_id);

Behaviour of Foreign Key Column on Delete


When two tables are connected by a foreign key and a record is deleted from the main (parent) table for which related
records exist in the child table, we need some mechanism to preserve the integrity of data in the child table.
There are two ways to do this.

• On Delete Cascade : This will remove the records from the child table whose foreign key value is deleted from
the main table.
• On Delete Set Null : This will set the foreign key column to NULL in the records of the child table whose foreign
key value is deleted from the main table.
• If we don't use any of the above, then we cannot delete data from the main table for which data in child table
exists. We will get an error if we try to do so.

ERROR : Record in child table exist
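The three delete behaviours can be tried out in a small sketch. It uses Python's built-in sqlite3 module, which is an assumption made only to get something runnable; the child tables Order_Cascade, Order_SetNull and Order_Default are hypothetical names introduced for illustration, one per behaviour. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON; other DBMSs enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign keys are enforced only when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE Customer_Detail ("
    "c_id INT PRIMARY KEY, Customer_Name VARCHAR(60), address VARCHAR(60))"
)
# Three hypothetical child tables, one per delete behaviour.
conn.execute(
    "CREATE TABLE Order_Cascade (order_id INT PRIMARY KEY, "
    "c_id INT REFERENCES Customer_Detail(c_id) ON DELETE CASCADE)"
)
conn.execute(
    "CREATE TABLE Order_SetNull (order_id INT PRIMARY KEY, "
    "c_id INT REFERENCES Customer_Detail(c_id) ON DELETE SET NULL)"
)
conn.execute(
    "CREATE TABLE Order_Default (order_id INT PRIMARY KEY, "
    "c_id INT REFERENCES Customer_Detail(c_id))"
)

conn.executemany(
    "INSERT INTO Customer_Detail VALUES (?, ?, ?)",
    [(101, "Adam", "Noida"), (102, "Alex", "Delhi")],
)
conn.execute("INSERT INTO Order_Cascade VALUES (10, 101)")
conn.execute("INSERT INTO Order_SetNull VALUES (11, 101)")
conn.execute("INSERT INTO Order_Default VALUES (12, 102)")

# Deleting customer 101 removes the cascade row and nulls the set-null row.
conn.execute("DELETE FROM Customer_Detail WHERE c_id = 101")
cascade_rows = conn.execute("SELECT * FROM Order_Cascade").fetchall()
set_null_rows = conn.execute("SELECT * FROM Order_SetNull").fetchall()

# Customer 102 is still referenced with no ON DELETE rule, so the delete fails.
try:
    conn.execute("DELETE FROM Customer_Detail WHERE c_id = 102")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

print(cascade_rows)    # []
print(set_null_rows)   # [(11, None)]
print(delete_blocked)  # True
```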

CHECK Constraint

CHECK constraint is used to restrict the values of a column to those satisfying a condition, such as a range. It checks
the values before storing them into the database. It's like condition checking before saving data into a column.

Example using CHECK constraint at Column Level


create table Student(s_id int NOT NULL CHECK(s_id > 0),

Name varchar(60) NOT NULL,

Age int);

The above query will restrict the s_id value to be greater than zero.

Example using CHECK constraint at Table Level


ALTER table Student add CHECK(s_id > 0);
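To see NOT NULL, UNIQUE and CHECK reject bad rows, the sketch below creates a Student table carrying all three on s_id and attempts four inserts. It runs on Python's built-in sqlite3 module (an assumption made only for runnability); try_insert is a hypothetical helper written for this example, and the sample names and ages are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student ("
    "s_id INT NOT NULL UNIQUE CHECK (s_id > 0), "
    "Name VARCHAR(60) NOT NULL, "
    "Age INT)"
)

def try_insert(s_id, name, age):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?, ?)", (s_id, name, age))
        return True
    except sqlite3.IntegrityError:
        return False

results = {
    "valid row": try_insert(1, "Adam", 20),
    "NULL s_id": try_insert(None, "Alex", 21),      # violates NOT NULL
    "duplicate s_id": try_insert(1, "Stuart", 22),  # violates UNIQUE
    "s_id not > 0": try_insert(0, "Abdul", 23),     # violates CHECK
}
stored = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]

print(results)
print(stored)  # 1
```

Only the first row is stored; each of the other three inserts is refused by a different constraint.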

SQL Functions
SQL provides many built-in functions to perform operations on data. These functions are useful while performing
mathematical calculations, string concatenations, sub-strings etc. SQL functions are divided into two categories,

• Aggregate Functions
• Scalar Functions

Aggregate Functions
These functions return a single value after calculating from a group of values. Following are some frequently used
Aggregate functions.

1) AVG()
AVG returns the average of the values in a numeric column.
Its general Syntax is,
SELECT AVG(column_name) from table_name

Example using AVG()


Consider following Emp table

eid name age salary

401 Abdul 22 9000

402 Shane 29 8000

403 Abdullah 34 6000

404 Scott 44 10000

405 Tiger 35 8000


SQL query to find average of salary will be,
SELECT avg(salary) from Emp;

Result of the above query will be,

avg(salary)

8200

2) COUNT()
COUNT returns the number of rows in a table, either all rows or only those matching a condition. COUNT(column_name)
counts only the rows in which that column is not NULL.
Its general Syntax is,
SELECT COUNT(column_name) from table-name

Example using COUNT()


Consider following Emp table

eid name age salary

401 Abdul 22 9000

402 Shane 29 8000

403 Abdullah 34 6000

404 Scott 44 10000

405 Tiger 35 8000


SQL query to count employees, satisfying specified condition is,
SELECT COUNT(name) from Emp where salary = 8000;

Result of the above query will be,

count(name)

2

Example of COUNT(distinct)
Consider following Emp table

eid name age salary

401 Abdul 22 9000

402 Shane 29 8000

403 Abdullah 34 6000

404 Scott 44 10000

405 Tiger 35 8000

SQL query is,


SELECT COUNT(distinct salary) from emp;

Result of the above query will be,

count(distinct salary)

4

3) FIRST()
FIRST function returns the first value of a selected column. It is supported in MS Access; most other DBMSs do not
provide it.
Syntax for FIRST function is,
SELECT FIRST(column_name) from table-name

Example of FIRST()
Consider following Emp table

eid name age salary

401 Abdul 22 9000

402 Shane 29 8000

403 Abdullah 34 6000

404 Scott 44 10000

405 Tiger 35 8000


SQL query
SELECT FIRST(salary) from Emp;

Result will be,

first(salary)

9000

4) LAST()
LAST function returns the last value of a selected column. It is supported in MS Access; most other DBMSs do not
provide it.
Syntax of LAST function is,
SELECT LAST(column_name) from table-name

Example of LAST()
Consider following Emp table

eid name age salary

401 Abdul 22 9000

402 Shane 29 8000

403 Abdullah 34 6000

404 Scott 44 10000

405 Tiger 35 8000


SQL query will be,
SELECT LAST(salary) from emp;

Result of the above query will be,

last(salary)

8000
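FIRST() and LAST() are not part of standard SQL, so on many systems the same result is obtained by sorting and keeping one row. The sketch below shows this with ORDER BY and LIMIT on Python's built-in sqlite3 module (an assumption made for runnability). LIMIT is the SQLite/MySQL/PostgreSQL spelling; other DBMSs use TOP or FETCH FIRST instead. Ordering by eid is also an assumption about what "first" and "last" should mean here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (eid INTEGER, name TEXT, age INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO Emp VALUES (?, ?, ?, ?)",
    [
        (401, "Abdul", 22, 9000),
        (402, "Shane", 29, 8000),
        (403, "Abdullah", 34, 6000),
        (404, "Scott", 44, 10000),
        (405, "Tiger", 35, 8000),
    ],
)

# "First" salary: sort ascending by eid and keep one row.
first_salary = conn.execute(
    "SELECT salary FROM Emp ORDER BY eid ASC LIMIT 1").fetchone()[0]

# "Last" salary: sort descending by eid and keep one row.
last_salary = conn.execute(
    "SELECT salary FROM Emp ORDER BY eid DESC LIMIT 1").fetchone()[0]

print(first_salary)  # 9000
print(last_salary)   # 8000
```

Stating the sort column explicitly matters because a table has no inherent row order; without ORDER BY, "first" and "last" are not well defined.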

5) MAX()
MAX function returns maximum value from selected column of the table.
Syntax of MAX function is,
SELECT MAX(column_name) from table-name

Example of MAX()
Consider following Emp table

eid name age salary

401 Abdul 22 9000

402 Shane 29 8000

403 Abdullah 34 6000

404 Scott 44 10000

405 Tiger 35 8000


SQL query to find Maximum salary is,
SELECT MAX(salary) from emp;

Result of the above query will be,

MAX(salary)

10000

6) MIN()
MIN function returns minimum value from a selected column of the table.
Syntax for MIN function is,
SELECT MIN(column_name) from table-name

Example of MIN()
Consider following Emp table,

eid name age salary

401 Abdul 22 9000

402 Shane 29 8000

403 Abdullah 34 6000

404 Scott 44 10000

405 Tiger 35 8000


SQL query to find minimum salary is,
SELECT MIN(salary) from emp;

Result will be,

MIN(salary)

6000

7) SUM()
SUM function returns the total of a selected column's numeric values.
Syntax for SUM is,
SELECT SUM(column_name) from table-name

Example of SUM()
Consider following Emp table

eid name age salary

401 Abdul 22 9000

402 Shane 29 8000

403 Abdullah 34 6000

404 Scott 44 10000

405 Tiger 35 8000


SQL query to find sum of salaries will be,
SELECT SUM(salary) from emp;

Result of above query is,

SUM(salary)

41000
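The standard aggregate functions can all be evaluated in a single query over the Emp table. This sketch uses Python's built-in sqlite3 module (an assumption made only so the query can be run); the variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (eid INTEGER, name TEXT, age INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO Emp VALUES (?, ?, ?, ?)",
    [
        (401, "Abdul", 22, 9000),
        (402, "Shane", 29, 8000),
        (403, "Abdullah", 34, 6000),
        (404, "Scott", 44, 10000),
        (405, "Tiger", 35, 8000),
    ],
)

# Each aggregate reduces the whole salary column to one value.
summary = conn.execute(
    "SELECT AVG(salary), COUNT(*), COUNT(DISTINCT salary), "
    "MAX(salary), MIN(salary), SUM(salary) FROM Emp"
).fetchone()

# COUNT combined with a WHERE condition counts only the matching rows.
count_8000 = conn.execute(
    "SELECT COUNT(name) FROM Emp WHERE salary = 8000").fetchone()[0]

print(summary)     # (8200.0, 5, 4, 10000, 6000, 41000)
print(count_8000)  # 2
```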

Scalar Functions
Scalar functions return a single value from an input value. Following are some frequently used Scalar Functions.

1) UCASE()
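UCASE function is used to convert the value of a string column to uppercase characters. UCASE is available in MySQL
and MS Access; the standard SQL name for the same function is UPPER.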
Syntax of UCASE,
SELECT UCASE(column_name) from table-name

Example of UCASE()
Consider following Emp table

eid name age salary

401 abdul 22 9000

402 shane 29 8000

403 Abdullah 34 6000

404 scott 44 10000

405 Tiger 35 8000


SQL query for using UCASE is,
SELECT UCASE(name) from emp;

Result is,

UCASE(name)

ABDUL

SHANE

ABDULLAH

SCOTT

TIGER

2) LCASE()
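LCASE function is used to convert the value of a string column to lowercase characters. LCASE is available in MySQL
and MS Access; the standard SQL name for the same function is LOWER.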
Syntax for LCASE is,
SELECT LCASE(column_name) from table-name

Example of LCASE()
Consider following Emp table

eid name age salary

401 abdul 22 9000

402 shane 29 8000

403 abdullah 34 6000

404 scott 44 10000

405 Tiger 35 8000

SQL query for converting string value to Lower case is,


SELECT LCASE(name) from emp;

Result will be,

LCASE(name)

abdul

shane

abdullah

scott

tiger

3) MID()
MID function is used to extract a substring from a string column, beginning at position start (counting from 1) and
containing length characters.
Syntax for MID function is,
SELECT MID(column_name, start, length) from table-name

Example of MID()
Consider following Emp table

eid name age salary

401 abdul 22 9000

402 shane 29 8000

403 abdullah 34 6000

404 scott 44 10000

405 Tiger 35 8000


SQL query will be,
select MID(name,2,2) from emp;

Result will come out to be,

MID(name,2,2)

bd

ha

bd

co

ig

4) ROUND()
ROUND function is used to round a numeric field to the given number of decimal places; rounding to 0 decimal places
gives the nearest integer. It is used on decimal point values.
Syntax of Round function is,
SELECT ROUND(column_name, decimals) from table-name

Example of ROUND()
Consider following Emp table

eid name age salary

401 abdul 22 9000.67

402 shane 29 8000.98

403 abdullah 34 6000.45

404 scott 44 10000

405 Tiger 35 8000.01

SQL query is,


SELECT ROUND(salary) from emp;

Result will be,

ROUND(salary)

9001

8001

6000

10000

8000
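The four scalar functions can be tried together in one query. The sketch below runs on Python's built-in sqlite3 module, which is an assumption made for runnability. SQLite does not provide UCASE, LCASE or MID, so the sketch uses the equivalent names UPPER, LOWER and SUBSTR; the arguments have the same meaning.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (eid INTEGER, name TEXT, age INTEGER, salary REAL)")
conn.executemany(
    "INSERT INTO Emp VALUES (?, ?, ?, ?)",
    [
        (401, "abdul", 22, 9000.67),
        (402, "shane", 29, 8000.98),
        (403, "abdullah", 34, 6000.45),
        (404, "scott", 44, 10000),
        (405, "Tiger", 35, 8000.01),
    ],
)

# A scalar function is applied to every row and returns one value per row.
rows = conn.execute(
    "SELECT UPPER(name), LOWER(name), SUBSTR(name, 2, 2), ROUND(salary) "
    "FROM Emp ORDER BY eid"
).fetchall()

upper_names = [r[0] for r in rows]
lower_names = [r[1] for r in rows]
substrings = [r[2] for r in rows]  # 2 characters starting at position 2
rounded = [r[3] for r in rows]

print(upper_names)  # ['ABDUL', 'SHANE', 'ABDULLAH', 'SCOTT', 'TIGER']
print(lower_names)  # ['abdul', 'shane', 'abdullah', 'scott', 'tiger']
print(substrings)   # ['bd', 'ha', 'bd', 'co', 'ig']
print(rounded)      # [9001.0, 8001.0, 6000.0, 10000.0, 8000.0]
```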

Join in SQL
SQL Join is used to fetch data from two or more tables, joined so that it appears as a single set of data. It combines
columns from two or more tables by using values common to the tables. The JOIN keyword is used in SQL queries for
joining two or more tables. Joining n tables requires at least (n-1) join conditions. A table can also be joined to
itself, which is known as a Self Join.

Types of Join
The following are the types of JOIN that we can use in SQL.

• Cross
• Inner (Equi)
• Natural
• Outer (Left, Right, Full)

Cross JOIN or Cartesian Product


This type of JOIN returns the cartesian product of rows from the tables in Join. It will return a table which consists
of records which combines each row from the first table with each row of the second table.
Cross JOIN Syntax is,
SELECT column-name-list

FROM table-name1

CROSS JOIN

table-name2;

Example of Cross JOIN


The class table,

ID NAME

1 abdul

2 adam

4 alex
The class_info table,

ID Address

1 ISLAMABAD

2 KHUSHAB

3 QUETTA

Cross JOIN query will be,


SELECT *

FROM class

CROSS JOIN class_info;

The result table will look like,

ID NAME ID Address

1 abdul 1 ISLAMABAD

2 adam 1 ISLAMABAD

4 alex 1 ISLAMABAD

1 abdul 2 KHUSHAB

2 adam 2 KHUSHAB

4 alex 2 KHUSHAB

1 abdul 3 QUETTA

2 adam 3 QUETTA

4 alex 3 QUETTA
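The row count of a cross join is the product of the two table sizes, which a short sketch can confirm. It uses Python's built-in sqlite3 module (an assumption made only so the query can be executed).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE class (ID INTEGER, NAME TEXT)")
conn.execute("CREATE TABLE class_info (ID INTEGER, Address TEXT)")
conn.executemany("INSERT INTO class VALUES (?, ?)",
                 [(1, "abdul"), (2, "adam"), (4, "alex")])
conn.executemany("INSERT INTO class_info VALUES (?, ?)",
                 [(1, "ISLAMABAD"), (2, "KHUSHAB"), (3, "QUETTA")])

# CROSS JOIN has no join condition: every class row meets every class_info row.
cross_rows = conn.execute("SELECT * FROM class CROSS JOIN class_info").fetchall()

print(len(cross_rows))  # 9, i.e. 3 rows x 3 rows
for row in cross_rows:
    print(row)
```

Because the result grows multiplicatively, a cross join of two large tables can become very large.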

INNER Join or EQUI Join


This is a simple JOIN in which the result is based on matched data as per the equality condition specified in the
query.
Inner Join Syntax is,
SELECT column-name-list

FROM table-name1

INNER JOIN

table-name2

ON table-name1.column-name = table-name2.column-name;

Example of Inner JOIN


The class table,

ID NAME

1 abdul

2 adam

3 alex

4 abdullah
The class_info table,

ID Address

1 ISLAMABAD

2 KHUSHAB

3 QUETTA
Inner JOIN query will be,
SELECT * FROM class INNER JOIN class_info ON class.id = class_info.id;

The result table will look like,

ID NAME ID Address

1 abdul 1 ISLAMABAD

2 adam 2 KHUSHAB

3 alex 3 QUETTA

Natural JOIN
Natural Join is a type of Inner Join that is based on the columns having the same name and the same datatype in both
of the tables to be joined.

Natural Join Syntax is,


SELECT *

FROM table-name1

NATURAL JOIN

table-name2;

Example of Natural JOIN


The class table,

ID NAME

1 abdul

2 adam

3 alex

4 abdullah
The class_info table,

ID Address

1 ISLAMABAD

2 KHUSHAB

3 QUETTA
Natural join query will be,
SELECT * from class NATURAL JOIN class_info;

The result table will look like,

ID NAME Address

1 abdul ISLAMABAD

2 adam KHUSHAB

3 alex QUETTA
In the above example, both the tables being joined have ID column(same name and same datatype), hence the
records for which value of ID matches in both the tables will be the result of Natural Join of these two tables.
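The difference between an inner join and a natural join shows up in the columns returned. The following sketch runs both on the same tables with Python's built-in sqlite3 module (an assumption made for runnability).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE class (ID INTEGER, NAME TEXT)")
conn.execute("CREATE TABLE class_info (ID INTEGER, Address TEXT)")
conn.executemany("INSERT INTO class VALUES (?, ?)",
                 [(1, "abdul"), (2, "adam"), (3, "alex"), (4, "abdullah")])
conn.executemany("INSERT INTO class_info VALUES (?, ?)",
                 [(1, "ISLAMABAD"), (2, "KHUSHAB"), (3, "QUETTA")])

# INNER JOIN: the join condition is written out, and both ID columns are kept.
inner_rows = conn.execute(
    "SELECT class.ID, class.NAME, class_info.ID, class_info.Address "
    "FROM class INNER JOIN class_info ON class.ID = class_info.ID "
    "ORDER BY class.ID"
).fetchall()

# NATURAL JOIN: the shared column name (ID) is matched implicitly
# and appears only once in the result.
natural_rows = sorted(
    conn.execute("SELECT * FROM class NATURAL JOIN class_info").fetchall()
)

print(inner_rows)
# [(1, 'abdul', 1, 'ISLAMABAD'), (2, 'adam', 2, 'KHUSHAB'), (3, 'alex', 3, 'QUETTA')]
print(natural_rows)
# [(1, 'abdul', 'ISLAMABAD'), (2, 'adam', 'KHUSHAB'), (3, 'alex', 'QUETTA')]
```

In both cases row 4 (abdullah) is dropped, because class_info has no row with ID 4.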

Outer JOIN
Outer Join is based on both matched and unmatched data. Outer Joins subdivide further into,

• Left Outer Join
• Right Outer Join
• Full Outer Join

Left Outer Join


The left outer join returns a result table with the matched data of the two tables, followed by the remaining rows of
the left table with null for the right table's columns.
Left Outer Join syntax is,
SELECT column-name-list

FROM table-name1

LEFT OUTER JOIN

table-name2

ON table-name1.column-name = table-name2.column-name;

Left outer Join Syntax for Oracle is,


SELECT column-name-list

FROM table-name1,

table-name2

WHERE table-name1.column-name = table-name2.column-name(+);

Example of Left Outer Join


The class table,

ID NAME

1 abdul

2 adam

3 alex

4 abdullah

5 ashish
The class_info table,

ID Address

1 ISLAMABAD

2 KHUSHAB

3 QUETTA

7 SIALKOT

8 PAKPATTAN

Left Outer Join query will be,


SELECT * FROM class LEFT OUTER JOIN class_info ON (class.id=class_info.id);

The result table will look like,

ID NAME ID Address

1 abdul 1 ISLAMABAD

2 adam 2 KHUSHAB

3 alex 3 QUETTA

4 abdullah null null

5 ashish null null
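A runnable sketch of the left outer join on the same data, using Python's built-in sqlite3 module (an assumption made only so the query can be executed). SQL NULL values come back as Python None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE class (ID INTEGER, NAME TEXT)")
conn.execute("CREATE TABLE class_info (ID INTEGER, Address TEXT)")
conn.executemany(
    "INSERT INTO class VALUES (?, ?)",
    [(1, "abdul"), (2, "adam"), (3, "alex"), (4, "abdullah"), (5, "ashish")],
)
conn.executemany(
    "INSERT INTO class_info VALUES (?, ?)",
    [(1, "ISLAMABAD"), (2, "KHUSHAB"), (3, "QUETTA"), (7, "SIALKOT"), (8, "PAKPATTAN")],
)

# Every row of the left table (class) is kept; where class_info has no
# matching ID, its columns are filled with NULL.
left_rows = conn.execute(
    "SELECT class.ID, class.NAME, class_info.ID, class_info.Address "
    "FROM class LEFT OUTER JOIN class_info ON class.ID = class_info.ID "
    "ORDER BY class.ID"
).fetchall()

for row in left_rows:
    print(row)
# (1, 'abdul', 1, 'ISLAMABAD')
# (2, 'adam', 2, 'KHUSHAB')
# (3, 'alex', 3, 'QUETTA')
# (4, 'abdullah', None, None)
# (5, 'ashish', None, None)
```

The class_info rows with ID 7 and 8 do not appear, because a left outer join keeps unmatched rows from the left table only.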

Right Outer Join


The right outer join returns a result table with the matched data of the two tables, followed by the remaining rows of
the right table with null for the left table's columns.
Right Outer Join Syntax is,
SELECT column-name-list

FROM table-name1

RIGHT OUTER JOIN

table-name2

ON table-name1.column-name = table-name2.column-name;

Right outer Join Syntax for Oracle is,


SELECT column-name-list

FROM table-name1,

table-name2

WHERE table-name1.column-name(+) = table-name2.column-name;

Example of Right Outer Join


The class table,

ID NAME

1 abdul

2 adam

3 alex

4 abdullah

5 ashish

The class_info table,

ID Address

1 ISLAMABAD

2 KHUSHAB

3 QUETTA

7 SIALKOT

8 PAKPATTAN

Right Outer Join query will be,


SELECT * FROM class RIGHT OUTER JOIN class_info on (class.id=class_info.id);

The result table will look like,

ID NAME ID Address

1 abdul 1 ISLAMABAD

2 adam 2 KHUSHAB

3 alex 3 QUETTA

null null 7 SIALKOT

null null 8 PAKPATTAN

Full Outer Join


The full outer join returns a result table with the matched data of the two tables, followed by the remaining rows of
the left table and then the remaining rows of the right table, with null in the columns of the table that has no match.
Full Outer Join Syntax is,
SELECT column-name-list

FROM table-name1

FULL OUTER JOIN

table-name2

ON table-name1.column-name = table-name2.column-name;

Example of Full Outer Join

The class table,

ID NAME

1 abdul

2 adam

3 alex

4 abdullah

5 ashish
The class_info table,

ID Address

1 ISLAMABAD

2 KHUSHAB

3 QUETTA

7 SIALKOT

8 PAKPATTAN
Full Outer Join query will be like,
SELECT * FROM class FULL OUTER JOIN class_info on (class.id=class_info.id);

The result table will look like,

ID NAME ID Address

1 abdul 1 ISLAMABAD

2 adam 2 KHUSHAB

3 alex 3 QUETTA

4 abdullah null null

5 ashish null null

null null 7 SIALKOT

null null 8 PAKPATTAN
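Some database systems (older SQLite releases, for example) do not support RIGHT or FULL OUTER JOIN. The sketch below shows one common workaround, run through Python's built-in sqlite3 module (an assumption made for runnability): a right outer join is written as a left outer join with the tables swapped, and a full outer join is built as a left outer join plus the unmatched rows of the right table. This is an emulation under those assumptions, not the native FULL OUTER JOIN syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE class (ID INTEGER, NAME TEXT)")
conn.execute("CREATE TABLE class_info (ID INTEGER, Address TEXT)")
conn.executemany(
    "INSERT INTO class VALUES (?, ?)",
    [(1, "abdul"), (2, "adam"), (3, "alex"), (4, "abdullah"), (5, "ashish")],
)
conn.executemany(
    "INSERT INTO class_info VALUES (?, ?)",
    [(1, "ISLAMABAD"), (2, "KHUSHAB"), (3, "QUETTA"), (7, "SIALKOT"), (8, "PAKPATTAN")],
)

# Right outer join of class with class_info, written as a left outer join
# that starts from class_info instead.
right_rows = conn.execute(
    "SELECT c.ID, c.NAME, i.ID, i.Address "
    "FROM class_info AS i LEFT OUTER JOIN class AS c ON c.ID = i.ID "
    "ORDER BY i.ID"
).fetchall()

# Full outer join: all left-join rows, plus class_info rows with no match.
full_rows = conn.execute(
    "SELECT c.ID, c.NAME, i.ID, i.Address "
    "FROM class AS c LEFT OUTER JOIN class_info AS i ON c.ID = i.ID "
    "UNION ALL "
    "SELECT c.ID, c.NAME, i.ID, i.Address "
    "FROM class_info AS i LEFT OUTER JOIN class AS c ON c.ID = i.ID "
    "WHERE c.ID IS NULL"
).fetchall()

print(len(right_rows))  # 5
print(len(full_rows))   # 7
```

The second branch keeps only rows where c.ID IS NULL, so the three matched rows are not counted twice.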
