
Apollo Institute of Engineering and Technology

Frequently asked Question & Solution

Degree Engineering

Branch: IT Semester: 3rd

Name of Subject: DBMS Subject Code: 3130703

Unit - 1: Database system architecture

Q:1 Explain DBMS System Architecture. [summer - 2020(7 marks)]


A: 1 The architecture of a database system is greatly influenced by the underlying
computer system on which the database is running:
i. Centralized.
ii. Client-server.
iii. Parallel (multi-processor).
iv. Distributed

Database Users:

Users are differentiated by the way they expect to interact with the system:

● Application programmers:
○ Application programmers are computer professionals who write
application programs. Application programmers can choose from many
tools to develop user interfaces.
○ Rapid application development (RAD) tools are tools that enable an
application programmer to construct forms and reports without writing a
program.
● Sophisticated users:
○ Sophisticated users interact with the system without writing programs.
Instead, they form their requests in a database query language.
○ They submit each such query to a query processor, whose function is to
break down DML statements into instructions that the storage manager
understands.
● Specialized users :
○ Specialized users are sophisticated users who write specialized database
applications that do not fit into the traditional data-processing framework.
○ Among these applications are computer-aided design systems, knowledge
base and expert systems, systems that store data with complex data types
(for example, graphics data and audio data), and environment-modeling
systems.
● Naïve users :
○ Naive users are unsophisticated users who interact with the system by
invoking one of the application programs that have been written
previously.
○ For example, a bank teller who needs to transfer $50 from account A to
account B invokes a program called transfer. This program asks the teller
for the amount of money to be transferred, the account from which the
money is to be transferred, and the account to which the money is to be
transferred.

Database Administrator:

● Coordinates all the activities of the database system. The database administrator
has a good understanding of the enterprise’s information resources and needs.
● Database administrator's duties include:
○ Schema definition: The DBA creates the original database schema by
executing a set of data definition statements in the DDL.
○ Storage structure and access method definition.
○ Schema and physical organization modification: The DBA carries out
changes to the schema and physical organization to reflect the changing
needs of the organization, or to alter the physical organization to improve
performance.
○ Granting user authority to access the database: By granting different types
of authorization, the database administrator can regulate which parts of
the database various users can access.
○ Specifying integrity constraints.
○ Monitoring performance and responding to changes in requirements.

Query Processor:
The query processor accepts queries from users and evaluates them by accessing the
database.
Parts of the query processor:
DDL interpreter
Interprets DDL statements and records the definitions in the data dictionary.
DML compiler
a. Translates DML statements in a query language into low-level instructions that the
query evaluation engine understands.
b. A query can usually be translated into any of a number of alternative evaluation
plans that all give the same result; the DML compiler performs query optimization by
selecting the lowest-cost plan.
Query evaluation engine
Executes the low-level instructions generated by the DML compiler.
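The plan selection done by the DML compiler can be observed in a real DBMS. A minimal sketch using SQLite's EXPLAIN QUERY PLAN (the Students table and idx_roll index here are illustrative, not from the syllabus examples):

```python
import sqlite3

# The query planner (SQLite's DML compiler) translates the same SQL into
# different low-level plans depending on the access paths available.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (Roll_No INTEGER, Name TEXT)")
conn.executemany("INSERT INTO Students VALUES (?, ?)",
                 [(i, f"student{i}") for i in range(100)])

def plan(sql):
    # EXPLAIN QUERY PLAN shows the evaluation plan chosen by the optimizer;
    # the human-readable detail is the fourth column of each row.
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Without an index the planner must scan every row.
print(plan("SELECT * FROM Students WHERE Roll_No = 42"))

# After an index is created, the same query compiles to an index search.
conn.execute("CREATE INDEX idx_roll ON Students(Roll_No)")
print(plan("SELECT * FROM Students WHERE Roll_No = 42"))
```

The SQL text never changes; only the evaluation plan chosen by the compiler does.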
Storage Manager/Storage Management:
A storage manager is a program module that provides the interface between the data
stored in the database and the application programs and queries submitted to the
system. Thus, the storage manager is responsible for storing, retrieving and updating
data in the database.
The storage manager components include:
Authorization and integrity manager: Checks for integrity constraints and authority of
users to access data.
Transaction manager:
Ensures that the database remains in a consistent state although there are system
failures.
File manager:
Manages the allocation of space on disk storage and the data structures used to
represent information stored on disk.
Buffer manager:
It is responsible for retrieving data from disk storage into main memory. It enables the
database to handle data sizes that are much larger than the size of main memory.
Data structures implemented by the storage manager:
Data files:
Store the database itself.
Data dictionary:
Stores metadata about the structure of the database.
Indices:
Provide fast access to data items.

Q:2 Define following terms.


1) Schema 2) Database Management System 3) Physical Data Independence
[winter - 2020(3 marks)]

A:2 Schema: A schema is the skeleton structure that represents the logical view of the
entire database. It defines how the data is organized and how the relations among
them are associated. It formulates all the constraints that are to be applied on the data.
Database Management System: A database management system (or DBMS) is
essentially nothing more than a computerized data-keeping system. Users of the
system are given facilities to perform several kinds of operations on such a system for
either manipulation of the data in the database or the management of the database
structure itself.
Physical Data Independence: Physical Data Independence is defined as the ability to
make changes in the structure of the lowest level of the Database Management System
(DBMS) without affecting the higher-level schemas. Hence, modification in the Physical
level should not result in any changes in the Logical or View levels.
Q:3 Explain three levels of data abstraction. [winter - 2021(3 marks)]
A: 3
The ANSI SPARC architecture divided into three levels:
1) External level
2) Conceptual level
3) Internal level

Internal Level
∙ This is the lowest level of data abstraction.
∙ It describes how the data are actually stored on storage devices.
∙ It is also known as the physical level.
∙ The internal view is described by the internal schema.
∙ The internal schema consists of the definitions of stored records, the methods of
representing the data fields, and the access methods used.
Conceptual Level
∙ This is the next higher level of data abstraction.
∙ It describes what data are stored in the database and what relationships exist among
those data.
∙ It is also known as the logical level.
∙ The conceptual view is defined by the conceptual schema, which describes all
records and relationships.
External Level
∙ This is the highest level of data abstraction.
∙ It is also known as the view level.
∙ It describes only the part of the entire database that a particular end user requires.
∙ The external view is described by an external schema.
∙ An external schema consists of definitions of the logical records and relationships in
the external view, and the methods of deriving objects (entities, attributes and
relationships) from the conceptual view.
Q:4 What are the main functions of a database administrator? [summer - 2020(3
marks)]
A:4 Main functions are as follow -
● working with database software to find ways to store, organise and manage data.
● troubleshooting.
● keeping databases up to date.
● helping with database design and development.
● managing database access.
● designing maintenance procedures and putting them into operation.
Q:5 Describe tasks performed by the Database Administrator. [winter - 2020(4
marks)]
A:5
1. Software Installation and Maintenance
A DBA often collaborates on the initial installation and configuration of a new database
(Oracle, SQL Server, etc.). The system administrator sets up hardware and deploys the
operating system for the database server, and then the DBA installs the database
software and configures it for use. As updates and patches are required, the DBA
handles this ongoing maintenance.
And if a new server is needed, the DBA handles the transfer of data from the existing
system to the new platform.
2. Data Extraction, Transformation, and Loading
Known as ETL, data extraction, transformation, and loading refers to efficiently
importing large volumes of data that have been extracted from multiple systems into a
data warehouse environment.
This external data is cleaned up and transformed to fit the desired format so that it can
be imported into a central repository.
3. Specialised Data Handling
Today’s databases can be massive and may contain unstructured data types such as
images, documents, or sound and video files. Managing a very large database (VLDB)
may require higher-level skills and additional monitoring and tuning to maintain
efficiency.
4. Database Backup and Recovery
DBAs create backup and recovery plans and procedures based on industry best
practices, then make sure that the necessary steps are followed. Backups cost time and
money, so the DBA may have to persuade management to take necessary precautions
to preserve data.
System admins or other personnel may actually create the backups, but it is the DBA’s
responsibility to make sure that everything is done on schedule.
5. Security
A DBA needs to know potential weaknesses of the database software and the
company’s overall system and work to minimise risks. No system is one hundred per
cent immune to attacks, but implementing best practices can minimise risks.
In the case of a security breach or irregularity, the DBA can consult audit logs to see
who has done what to the data. Audit trails are also important when working with
regulated data.
6. Authentication
Setting up employee access is an important aspect of database security. DBAs control
who has access and what type of access they are allowed.
7. Capacity Planning
The DBA needs to know how large the database currently is and how fast it is growing
in order to make predictions about future needs. Storage refers to how much room the
database takes up in server and backup space. Capacity refers to usage level.
8. Performance Monitoring
Monitoring databases for performance issues is part of the ongoing system
maintenance a DBA performs. If some part of the system is slowing down processing,
the DBA may need to make configuration changes to the software or add additional
hardware capacity.
9. Database Tuning
Performance monitoring shows where the database should be tweaked to operate as
efficiently as possible. The physical configuration, the way the database is indexed, and
how queries are handled can all have a dramatic effect on database performance.
With effective monitoring, it is possible to proactively tune a system based on
application and usage instead of waiting until a problem develops.
10. Troubleshooting
DBAs are on call for troubleshooting in case of any problems. Whether they need to
quickly restore lost data or correct an issue to minimise damage, a DBA needs to
quickly understand and respond to problems when they occur.
Q:6 Explain the difference between physical and logical data
independence.[summer - 2020(4 marks)]

A: 6
Here are the differences between physical and logical data independence.

● Basics: Physical data independence is concerned mainly with how a set of
data gets stored in a given system. Logical data independence is concerned
mainly with changes to the definition of the data in a system or its structure
as a whole.
● Ease of retrieving: With physical data independence, we can easily retrieve
data. With logical data independence, retrieving is very difficult because the
data mainly depends on its logical structure and not its physical location.
● Ease of achieving: Achieving physical data independence is much easier
than achieving logical data independence.
● Degree of changes required: Changes made at the physical level need not
be made at the application level. Changes made at the logical level may need
to be made at the application level as well.
● Internal modification: For physical data independence, we may or may not
need modifications at the internal level to improve the performance of the
system. For logical data independence, modification at the logical level is
required whenever we want to change the database structure.
● Type of schema: For physical data independence, the internal schema is the
primary concern. For logical data independence, the conceptual schema is
the primary concern.
● Examples: Physical: altering the compression techniques, changing storage
devices, or changing the hashing algorithms. Logical: adding, deleting, or
modifying any attribute in a system.
Q:7 Describe Data Definition Language and Data Manipulation Language. [winter -
2021(4 marks)]
What is Data Definition Language? List DDL statements and explain any one with
an example. [summer - 2021(3 marks)]

A: 7

DDL (Data Definition Language)


● It is a set of SQL commands used to create, modify and delete database objects
such as tables, views, indices, etc.
● It is normally used by DBA and database designers.
● It provides commands like:
CREATE: to create objects in a database.
ALTER: to alter the schema, or logical structure of the database.
DROP: to delete objects from the database.
TRUNCATE: to remove all records from the table.
CREATE: Create is used to create the database or its objects like table, view, index etc.
Create Table
The CREATE TABLE statement is used to create a new table in a database.
Syntax:
CREATE TABLE table_name
(
Column1 Datatype(Size) [ NULL | NOT NULL ],
Column2 Datatype(Size) [ NULL | NOT NULL ],
...
);
Example:
CREATE TABLE Students
(
Roll_No int(3) NOT NULL,
Name varchar(20),
Subject varchar(20)
);
Explanation:
● The column should either be defined as NULL or NOT NULL. By default, a
column can hold NULL values.
● The NOT NULL constraint enforces a column to NOT accept NULL values. This
enforces a field to always contain a value, which means that you cannot insert a
new record, or update a record without adding a value to this field.
ALTER: ALTER TABLE statement is used to add, modify, or drop columns in a table.
Add Column
The ALTER TABLE statement in SQL is used to add new columns to a table.
Syntax:
ALTER TABLE table_name
ADD Column1 Datatype(Size), Column2 Datatype(Size), ... ;
Example:
ALTER TABLE Students
ADD Marks int;
Drop Column
The ALTER TABLE statement in SQL is used to drop a column from a table.
Syntax:
ALTER TABLE table_name
DROP COLUMN column_name;
Example:
ALTER TABLE Students
DROP COLUMN Subject;
Modify Column
The ALTER TABLE statement in SQL is used to change the data type or size of a
column in a table.
Syntax:
ALTER TABLE table_name
ALTER COLUMN column_name datatype(size);
Example:
ALTER TABLE Students
ALTER COLUMN Roll_No float;
DROP: Drop is used to drop the database or its objects like table, view, index etc.
Drop Table
The DROP TABLE statement is used to drop an existing table in a database.
Syntax:
DROP TABLE table_name;
Example:
DROP TABLE Students;
TRUNCATE: Truncate is used to remove all records from the table.
Syntax:
TRUNCATE TABLE table_name;
Example:
TRUNCATE TABLE Students;
DML (Data Manipulation Language)
● It is a set of SQL commands used to insert, modify and delete data in a
database.
● It is normally used by general users who access the database via
pre-developed applications.
● It provides commands like:
INSERT: to insert data into a table.
UPDATE: to modify existing data in a table.
DELETE: to delete records from a table.
INSERT: The INSERT STATEMENT is used to insert data into a table.
Syntax:
INSERT INTO table_name (column1, column2, column3, ...)
VALUES (value1, value2, value3, ...);
OR
INSERT INTO table_name
VALUES (value1, value2, value3, ...);
Example:
INSERT INTO Students (Roll_No,Name,Subject)
VALUES (1, 'anil', 'Maths');
UPDATE: The UPDATE STATEMENT is used to modify existing data in a table.
Syntax:
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;
Example:
UPDATE Students
SET Name = 'Mahesh'
WHERE Roll_No=1;
DELETE: The DELETE STATEMENT is used to delete records from a table.
Syntax:
DELETE FROM table_name
WHERE condition;
Example:
DELETE FROM Students
WHERE Subject = 'Maths';
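The DDL and DML statements above can be run end-to-end against SQLite, a lightweight DBMS. A minimal sketch mirroring the Students examples (SQLite accepts the int/varchar declarations but ignores the size part, and it has no TRUNCATE, so DELETE without a WHERE clause plays that role):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: create the table, then alter its schema by adding a column.
cur.execute("""CREATE TABLE Students (
    Roll_No int NOT NULL,
    Name    varchar(20),
    Subject varchar(20)
)""")
cur.execute("ALTER TABLE Students ADD Marks int")

# DML: insert, update, and delete data.
cur.execute("INSERT INTO Students (Roll_No, Name, Subject) VALUES (1, 'anil', 'Maths')")
cur.execute("INSERT INTO Students VALUES (2, 'bina', 'Physics', 90)")
cur.execute("UPDATE Students SET Name = 'Mahesh' WHERE Roll_No = 1")
cur.execute("DELETE FROM Students WHERE Subject = 'Maths'")

# Only the Physics row survives the DELETE.
print(cur.execute("SELECT Roll_No, Name FROM Students").fetchall())  # [(2, 'bina')]
```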

Q.7 Enlist and explain the advantages of DBMS over traditional file
system.[Winter-2019(7 marks)]
A:7
File System: A file management system allows access to single files or tables at a
time. In a file system, data is directly stored in a set of files. It contains flat files that
have no relation to other files (when only one table is stored in a single file, then this
file is known as a flat file).

DBMS: A Database Management System (DBMS) is application software that allows


users to efficiently define, create, maintain and share databases. Defining a database
involves specifying the data types, structures and constraints of the data to be stored in
the database. Creating a database involves storing the data on some storage medium
that is controlled by DBMS. Maintaining a database involves updating the database
whenever required to evolve and reflect changes in the miniworld and also generating
reports for each change. Sharing a database involves allowing multiple users to access
the database. DBMS also serves as an interface between the database and end users
or application programs. It provides control access to the data and ensures that data is
consistent and correct by defining rules on them.
An application program accesses the database by sending queries or requests for data
to the DBMS. A query causes some data to be retrieved from the database.

Advantages of DBMS over File system:


● Data redundancy and inconsistency: Redundancy is the concept of repetition
of data i.e. each data may have more than a single copy. The file system
cannot control the redundancy of data as each user defines and maintains the
needed files for a specific application to run. There may be a possibility that
two users are maintaining the data of the same file for different applications.
Hence, changes made by one user are not reflected in the files used by the
second user, which leads to inconsistency of data. Whereas DBMS controls
redundancy by maintaining a single repository of data that is defined once
and is accessed by many users. As there is no or less redundancy, data
remains consistent.
● Data sharing: The file system does not allow sharing of data or sharing is too
complex. Whereas in DBMS, data can be shared easily due to a centralized
system.
● Data concurrency: Concurrent access to data means more than one user is
accessing the same data at the same time. Anomalies occur when changes
made by one user get lost because of changes made by another user. The
file system does not provide any procedure to prevent such anomalies, whereas
DBMS provides a locking system to prevent them from occurring.
● Data searching: For every search operation performed on the file system, a
different application program has to be written. While DBMS provides inbuilt
searching operations. The user only has to write a small query to retrieve data
from the database.
● Data integrity: There may be cases when some constraints need to be
applied to the data before inserting it into the database. The file system does
not provide any procedure to check these constraints automatically. Whereas
DBMS maintains data integrity by enforcing user-defined constraints on data
by itself.
● System crashing: In some cases, systems might crash due to various
reasons. This is a serious problem for file systems, because once the system
crashes there is no way to recover the lost data. A DBMS has a recovery
manager that restores the data, which is another advantage over file
systems.
● Data security: A file system provides only a password mechanism to protect
data, which offers limited protection. A DBMS has specialized features, such
as user authorization and access control, that help shield its data.
● Backup: It creates a backup subsystem to restore the data if required.
● Interfaces: It provides different multiple user interfaces like graphical user
interface and application program interface.
● Easy Maintenance: It is easily maintainable due to its centralized nature.
Q.8 Explain Instance and Schema in detail.[Winter-2019(3 marks)]

A:8
The overall design of the database is called database schema. Schema will not be
changed frequently. It is the logical structure of a database. It does not show the data in
the database.

Types of Schema
The different types of schemas are as follows −
● Physical schema − It is a database design at the physical level. It is hidden
below the logical schema and can be changed easily without affecting the
application programs.
● Logical schema − It is a database design at the logical level. Programmers
construct applications using logical schema.
● External − It is schema at view level. It is the highest level of a schema
which defines the views for end users.

Generally the Database Management System (DBMS) supports one physical schema,
one logical schema and several sub-schemas or external schemas.
Database schema refers to the format and layout of the database in which the data will
be stored. It remains the same throughout unless it is explicitly modified.
Instance: An instance (also called the extension or database state) is the collection of
information stored in the database at a particular moment. Thus, it is a dynamic value
which keeps on changing, while the schema stays fixed.
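The schema/instance distinction can be seen directly in a DBMS. A minimal sketch using SQLite (the Account table here is an illustrative assumption): the schema text stays identical while the instance changes with every update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (acc_no TEXT PRIMARY KEY, balance REAL)")

def schema():
    # The stored schema definition, as recorded in SQLite's catalog.
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'Account'").fetchone()[0]

def instance():
    # The database state: the rows stored at this moment.
    return conn.execute("SELECT * FROM Account ORDER BY acc_no").fetchall()

before = schema()
conn.execute("INSERT INTO Account VALUES ('A', 500.0)")
conn.execute("INSERT INTO Account VALUES ('B', 700.0)")
conn.execute("UPDATE Account SET balance = balance - 50 WHERE acc_no = 'A'")

# The instance has changed, but the schema is exactly the same definition.
print(instance())          # [('A', 450.0), ('B', 700.0)]
print(schema() == before)  # True
```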
Unit - 2: Data models

Q:1 Differentiate strong entity set and weak entity set. Demonstrate the concept
of both using real-time example using E-R diagram.[winter - 2020(7 marks)]

A:1

● Definition: A strong entity is complete by itself and is not dependent on any
other entity type. A weak entity cannot be used independently, as it is
dependent on a strong entity type known as its owner entity.
● Nature: A strong entity is independent in nature. A weak entity is dependent
in nature; it depends on the strong entity.
● Primary key: A strong entity has a primary key. A weak entity does not have
a primary key, but it has a partial key (discriminator).
● Key attribute: A strong entity has a key attribute. A weak entity does not have
any key attribute.
● Representation: A strong entity is represented by a single rectangular box. A
weak entity is represented by a double rectangular box.
● Participation: A strong entity can have either no participation or total
participation. A weak entity always has total participation.
● Relationship between two entities: The relationship between two strong
entities is represented by a single diamond. The relationship between a weak
entity and a strong entity is represented by a double diamond.
Strong Entity (E-R diagram example)

Weak Entity (E-R diagram example)
Q:2 What is integrity constraint? Explain primary key, reference key and check
constraint with SQL syntax. [winter - 2021(7 marks)]
A:2
● Integrity constraints are a set of rules. It is used to maintain the quality of
information.
● Integrity constraints ensure that the data insertion, updating, and other processes
have to be performed in such a way that data integrity is not affected.
● Thus, integrity constraint is used to guard against accidental damage to the
database.
1. Check
This constraint defines a business rule on a column. All the rows in that column
must satisfy this rule.
Limits the data values of variables to a specific set, range, or list of values.
The constraint can be applied for a single column or a group of columns.
E.g. value of SPI should be between 0 to 10.
2. Primary key
This constraint defines a column or combination of columns which uniquely
identifies each row in the table.
Primary key = Unique key + Not null
E.g enrollment no column should have unique value as well as can’t be null.
3. Foreign key
A referential integrity constraint (foreign key) is specified between two tables.
In a referential integrity constraint, if a foreign key column in table 1 refers to
the primary key column of table 2, then every value of the foreign key column in
table 1 must either be null or be present in the primary key column of table 2.
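The three constraints with their SQL syntax, checked against SQLite (the Student/Result tables and column names are illustrative, following the SPI and enrollment-no examples in the text):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.execute("""CREATE TABLE Student (
    enrollment_no INTEGER PRIMARY KEY,     -- primary key = unique + not null
    name TEXT,
    spi REAL CHECK (spi BETWEEN 0 AND 10)  -- check constraint: business rule
)""")
conn.execute("""CREATE TABLE Result (
    enrollment_no INTEGER REFERENCES Student(enrollment_no),  -- foreign key
    semester INTEGER
)""")

conn.execute("INSERT INTO Student VALUES (101, 'Anil', 8.2)")

# Each of these violates one constraint and raises sqlite3.IntegrityError:
for bad in ["INSERT INTO Student VALUES (101, 'Dup', 7.0)",   # duplicate primary key
            "INSERT INTO Student VALUES (102, 'Bad', 12.0)",  # SPI outside 0..10
            "INSERT INTO Result VALUES (999, 1)"]:            # no such student
    try:
        conn.execute(bad)
        print("accepted")
    except sqlite3.IntegrityError as e:
        print("rejected:", e)
```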

Q:3 Differentiate single-valued and multi-valued attributes with example. [winter -


2021(3 marks)]
A:3
A single-valued attribute holds only one value for a given entity; e.g., a student's
Roll_No or date of birth has exactly one value per student.
A multi-valued attribute can hold more than one value for the same entity; e.g., a
student may have several phone numbers or email addresses. In an E-R diagram a
multi-valued attribute is drawn as a double oval.
Q:4 What is weak entity? How the weak entity can be converted to the strong
entity? Show the symbol for representing weak entity. [winter - 2021(4 marks)]
A:4

❖ An entity set that does not have a primary key is called a weak entity set.
❖ The existence of a weak entity set depends on the existence of a strong entity set.
❖ A weak entity set is indicated by a double rectangle.
❖ A weak entity relationship set is indicated by a double diamond.
❖ The discriminator (partial key) of a weak entity set is the set of attributes that
distinguishes between all the entities of the weak entity set.
❖ The primary key of a weak entity set is created by combining the primary key of
the strong entity set on which the weak entity set is existence dependent and the
weak entity set's discriminator; giving the weak entity set this full primary key is
how it is converted into a strong entity set.
❖ We underline the discriminator attribute of a weak entity set with a dashed line.
❖ E.g. consider two entities, loan and payment, in which loan is a strong entity set
and payment is a weak entity set.
❖ The payment entity has payment-no, which is the discriminator.
❖ The loan entity has loan-no as its primary key.
❖ So the primary key for payment is (loan-no, payment-no).
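The loan/payment example can be expressed as relational tables, where payment's primary key combines the owner's key with the discriminator. A sketch checked against SQLite (column names are assumptions based on the text):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE loan (loan_no INTEGER PRIMARY KEY, amount REAL)")
conn.execute("""CREATE TABLE payment (
    loan_no    INTEGER REFERENCES loan(loan_no),
    payment_no INTEGER,                 -- discriminator (partial key)
    amount     REAL,
    PRIMARY KEY (loan_no, payment_no)   -- owner's key + discriminator
)""")

conn.execute("INSERT INTO loan VALUES (1, 10000)")
conn.execute("INSERT INTO loan VALUES (2, 5000)")
# payment_no only discriminates within one loan, so it may repeat across loans.
conn.execute("INSERT INTO payment VALUES (1, 1, 500)")
conn.execute("INSERT INTO payment VALUES (2, 1, 250)")  # fine: different loan
try:
    conn.execute("INSERT INTO payment VALUES (1, 1, 300)")  # same (loan, payment)
except sqlite3.IntegrityError:
    print("duplicate weak-entity key rejected")
```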
Q:5 Why do we require E-R model? Explain the term
‘Generalization’,‘Specialization’ and ‘Aggregation’. [winter - 2021(7 marks)]
A:5
Entity-relationship (ER) model/diagram is a graphical representation of entities and their
relationships to each other with their attributes.

Generalization is the process of creating a higher-level entity that contains the common
attributes or properties of two or more lower-level entities. The entity that is created will
contain the common features. Generalization is a bottom-up process.

We can have three sub-entities Car, Truck and Motorcycle, and these three entities can
be generalized into one general super class, Vehicle.

It is a form of abstraction that specifies that two or more entities (sub classes) having
common characteristics can be generalized into one single entity (super class) at a
higher level, hiding all the differences.
Specialization
Specialization is a process of identifying subsets of an entity that shares different
characteristics. It breaks an entity into multiple entities from higher level (super class) to
lower level (sub class). The breaking of higher level entity is based on some
distinguishing characteristics of the entities in super class.
It is a top down approach in which we first define the super class and then sub class
and then their attributes and relationships.
Aggregation
Aggregation represents relationship between a whole object and its component. Using
aggregation we can express relationship among relationships. Aggregation shows
‘has-a’ or ‘is-part-of’ relationship between entities where one represents the ‘whole’ and
other ‘part’.
Consider a ternary relationship Works_On between Employee, Branch and Manager.
Now the best way to model this situation is to use aggregation, So, the relationship-set,
Works_On is a higher level entity-set. Such an entity-set is treated in the same manner
as any other entity-set. We can create a binary relationship, Manager, between
Works_On and Manager to represent who manages what tasks.

Q:6 What is the similarity between relational model and E-R model? How the
entity, attributes, primary key and relationship are shown in the relational model.
[winter - 2021(7 marks)]
A:6
Similarity: both the E-R model and the relational model describe data and the
associations among data at the logical level, and an E-R design can be translated
directly into relation schemas: entity sets and relationship sets become tables,
attributes become columns, and keys carry over.
1. Attribute:

Each column in a Table. Attributes are the properties which define a relation. e.g.,
Student_Rollno, NAME,etc.

2. Entity:

An entity is an object or component of data. An entity is represented as rectangle in an


ER diagram.

For example: In the following ER diagram we have two entities, Student and College,
and these two entities have a many-to-one relationship, as many students study in a
single college.

3. Relationship

Cardinality: Defines the numerical attributes of the relationship between two entities or
entity sets.

A relationship is represented by a diamond shape in an ER diagram; it shows the
relationship among entities. There are four types of cardinal relationships:
1. One to One
2. One to Many
3. Many to One
4. Many to Many

4. Primary Key:

The primary key is an attribute or a set of attributes that uniquely identify a specific
instance of an entity. Every entity in the data model must have a primary key whose
values uniquely identify instances of the entity.
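As a sketch of this mapping (table and column names are assumptions based on the Student/College example above): each entity set becomes a table, attributes become columns, the key attribute becomes the table's primary key, and the many-to-one relationship becomes a foreign key on the "many" side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
# Entity set College -> table; attributes -> columns; key -> PRIMARY KEY.
conn.execute("CREATE TABLE College (college_id INTEGER PRIMARY KEY, name TEXT)")
# Entity set Student; the many-to-one relationship becomes a foreign key
# on the 'many' side (Student).
conn.execute("""CREATE TABLE Student (
    student_rollno INTEGER PRIMARY KEY,
    name TEXT,
    college_id INTEGER REFERENCES College(college_id)
)""")
conn.execute("INSERT INTO College VALUES (1, 'ABC College')")
conn.execute("INSERT INTO Student VALUES (10, 'Anil', 1)")
conn.execute("INSERT INTO Student VALUES (11, 'Bina', 1)")  # many students, one college
rows = conn.execute("""SELECT s.name, c.name FROM Student s
                       JOIN College c ON s.college_id = c.college_id
                       ORDER BY s.student_rollno""").fetchall()
print(rows)  # [('Anil', 'ABC College'), ('Bina', 'ABC College')]
```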

Q:7 Describe the differences in meaning between the terms relation and relation
schema.[summer - 2020(3 marks)]

A:7 Relation: A relation is a table, i.e., the actual set of tuples (rows) stored at a given
moment; it is an instance of a relation schema.
Relation schema: A set of attributes is called a relation schema (or relation scheme).
A relation schema is also known as a table schema (or table scheme). A relation
schema can be thought of as the basic information describing a table or relation. It is
the logical definition of a table: it defines the name of the table and the set of column
names, together with the data type associated with each column.

Relational schema may also refer to a database schema. A database schema is the
collection of relation schemas for a whole database. A relational or database schema
is a collection of meta-data. A database schema describes the structure and
constraints of data represented in a particular domain. A relational schema can be
described as a blueprint of a database that outlines the way data is organized into
tables. This blueprint does not contain any data. In a relational schema, each tuple is
divided into fields, and the set of permitted values of each field is called its domain.

There are different kinds of database schemas:


● Conceptual schema
● Logical schema
● Physical schema

Q:8 Write the following queries in relational algebra:


(1) Find the names of suppliers who supply some red part.
(2) Find the IDs of suppliers who supply some red or green part. [summer -
2020(4 marks)]

A:8

Here are the queries.

1) πsname((σcolour='red'(Part) ⋈ Catalog) ⋈ Supplier)

Since there is no subscript under the joins, the joins are natural joins, i.e., the
common attributes are equated.

2) πsid(σcolour='red' ∨ colour='green'(Part) ⋈ Catalog)

An equivalent formulation uses the union operator:

πsid((σcolour='red'(Part) ⋈ Catalog) ∪ (σcolour='green'(Part) ⋈ Catalog))

The latter version can be refined by pushing the projection through the union:
πsid(σcolour='red'(Part) ⋈ Catalog) ∪ πsid(σcolour='green'(Part) ⋈ Catalog)
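The relational-algebra operators can be sketched in a few lines of Python to check that the two formulations of query (2) above agree. The relation contents here are made up purely for illustration:

```python
def select(rel, pred):            # sigma: keep tuples satisfying the predicate
    return [t for t in rel if pred(t)]

def project(rel, attrs):          # pi: set semantics, duplicates removed
    return {tuple(t[a] for a in attrs) for t in rel}

def njoin(r, s):                  # natural join: equate common attributes
    out = []
    for t in r:
        for u in s:
            common = set(t) & set(u)
            if all(t[a] == u[a] for a in common):
                out.append({**t, **u})
    return out

Part = [{"pid": 1, "colour": "red"}, {"pid": 2, "colour": "green"},
        {"pid": 3, "colour": "blue"}]
Catalog = [{"sid": 10, "pid": 1}, {"sid": 11, "pid": 2}, {"sid": 12, "pid": 3}]

# pi_sid( sigma_{colour='red' or colour='green'}(Part) |x| Catalog )
q_a = project(njoin(select(Part, lambda t: t["colour"] in ("red", "green")),
                    Catalog), ["sid"])
# pi_sid(sigma_red(Part) |x| Catalog)  union  pi_sid(sigma_green(Part) |x| Catalog)
q_b = (project(njoin(select(Part, lambda t: t["colour"] == "red"), Catalog), ["sid"])
       | project(njoin(select(Part, lambda t: t["colour"] == "green"), Catalog), ["sid"]))
print(q_a == q_b)   # True: both formulations give the same suppliers
print(sorted(q_a))  # [(10,), (11,)]
```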

Q:9 An ER diagram can be viewed as a graph. What do the following mean in


terms of the structure of an enterprise schema?
(1) The graph is disconnected.
(2) The graph is acyclic. [summer - 2020(7 marks)]

A:9

a. If a pair of entity sets are connected by a path in an E-R diagram, the entity sets are
related, though perhaps indirectly. A disconnected graph implies that there are pairs of
entity sets that are unrelated to each other. If we split the graph into connected
components, we have, in effect, a separate database corresponding to each connected
component.

b. As indicated in the answer to the previous part, a path in the graph between a pair of
entity sets indicates a (possibly indirect) relationship between the two entity sets. If
there is a cycle in the graph then every pair of entity sets on the cycle are related to
each other in at least two distinct ways. If the E-R diagram is acyclic then there is a
unique path between every pair of entity sets and, thus, a unique relationship between
every pair of entity sets.
Q: 10 Draw ER diagram for university database consisting four entities Student,
Department, Class and Faculty.

Student has a unique id; a student can enroll for multiple classes and has at
most one major. Faculty must belong to a department, and a faculty member can
teach multiple classes. Each class is taught by only one faculty member. Every
student gets a grade for each class he/she has enrolled in.

A:10
Q:11 Draw an E-R diagram of following scenario. Make necessary assumptions
and clearly note down the same. We would like to make our College’s manually
operated Library to fully computerized .[winter 2021(7 marks)]

A:11
Q:12 Draw E-R diagram for student management system with the necessary
assumption. [summer - 2021(7 marks)]

A:12

Q: 13 List and explain mapping cardinalities of E-R diagram with example.


[summer - 2021(4 marks)]

A:13

It is expressed as the number of entities to which another entity can be associated via a
relationship set.

For binary relationship set there are entity set A and B then the mapping cardinality can
be one of the following −
● One-to-one
● One-to-many
● Many-to-one
● Many-to-many

One-to-one relationship

One entity of A is associated with one entity of B.

Example

Given below is an example of the one-to-one relationship in the mapping cardinality.


Here, one department has one head of the department (HOD).

One-to-many relationship

An entity in A is associated with any number (zero or more) of entities in B, and an
entity in B is associated with at most one entity in A.
Example

Given below is an example of the one-to-many relationship in the mapping cardinality.


Here, one department has many faculties.

Many-to-one relationship

An entity in A is associated with at most one entity in B, and an entity in B can be
associated with any number (zero or more) of entities in A.
Example

Given below is an example of the many-to-one relationship in the mapping cardinality.


Here, many faculties work in one department.

Many-to-many relationship

Many entities of A are associated with many entities of B.

An entity in A is associated with many entities of B and an entity in B is associated with


many entities of A.

Many-to-many = many-to-one + one-to-many

Example

Given below is an example of the many-to-many relationship in the mapping cardinality.


Here, many employees work on many projects.
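In SQL, these cardinalities are typically enforced with primary and foreign keys. The sketch below (table and column names are invented for illustration, not part of the question) shows the one-to-many case: a foreign key on the "many" side lets many faculty rows point to one department row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# One department...
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dname TEXT)")
# ...many faculties: each faculty row references exactly one department.
conn.execute("""CREATE TABLE faculty (
                    fac_id  INTEGER PRIMARY KEY,
                    fname   TEXT,
                    dept_id INTEGER NOT NULL REFERENCES department(dept_id))""")

conn.execute("INSERT INTO department VALUES (1, 'IT')")
conn.executemany("INSERT INTO faculty VALUES (?, ?, ?)",
                 [(10, 'A', 1), (11, 'B', 1)])   # two faculties, one department

count = conn.execute(
    "SELECT COUNT(*) FROM faculty WHERE dept_id = 1").fetchone()[0]
assert count == 2
```

A one-to-one mapping would additionally put a UNIQUE constraint on the foreign-key column, and a many-to-many mapping would use a separate linking table holding both keys.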
Q:14 Construct an E-R diagram for a car insurance company whose customers
own one or more cars each. Each car has associated with it zero to any number
of recorded accidents. Each insurance policy covers one or more cars, and has
one or more premium payments associated with it. Each payment is for a
particular period of time and has an associated due date and the date when the
payment was received. [Winter-2019(7 marks)]

A:14

Q:15 Explain specialization and generalization concepts in ER diagram

with suitable example.[Winter-2019(7 marks)]

A:15 Generalization, Specialization and Aggregation in ER model are used for data
abstraction in which abstraction mechanism is used to hide details of a set of objects.
Generalization –
Generalization is the process of extracting common properties from a set of entities and
create a generalized entity from it. It is a bottom-up approach in which two or more
entities can be generalized to a higher level entity if they have some attributes in
common. For Example, STUDENT and FACULTY can be generalized to a higher level
entity called PERSON as shown in Figure 1. In this case, common attributes like
P_NAME, P_ADD become part of higher entity (PERSON) and specialized attributes
like S_FEE become part of specialized entity (STUDENT).

Specialization –
In specialization, an entity is divided into sub-entities based on their characteristics. It is
a top-down approach where higher level entity is specialized into two or more lower
level entities. For Example, EMPLOYEE entity in an Employee management system
can be specialized into DEVELOPER, TESTER etc. as shown in Figure 2. In this case,
common attributes like E_NAME, E_SAL etc. become part of higher entity
(EMPLOYEE) and specialized attributes like TES_TYPE become part of specialized
entity (TESTER).
Q:16 What do you mean by integrity constraints? Discuss various

integrity constraints.[Winter -2019(3 marks)]

A:16 Integrity constraints are a set of rules. It is used to maintain the quality of
information.

● Integrity constraints ensure that the data insertion, updating, and other processes
have to be performed in such a way that data integrity is not affected.

● Thus, integrity constraint is used to guard against accidental damage to the


database.
● Types of Integrity Constraint
Unit - 3: Relational query languages
Q:1 Write Relational Algebra syntax for the following queries.
Employee(eno,ename,salary,designation)
Customer(cno,cname,address,city)
1) Find out name of employees who are ‘Manager’.
2) Display name of customers.
3) Retrieve Employee records whose salary is less than 20,000. [winter - 2021(3
marks)]
A:1
(1) πename(σdesignation = ‘Manager’ (Employee))
(2) πcname(Customer)
(3) σsalary < 20000 (Employee)
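The three requests can also be tried out as SQL against SQLite; the sample rows below are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (eno INT, ename TEXT, salary INT, designation TEXT)")
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)",
                 [(1, 'Asha', 15000, 'Manager'),
                  (2, 'Ravi', 25000, 'Clerk')])

# (1) names of employees who are 'Manager'
managers = [r[0] for r in conn.execute(
    "SELECT ename FROM Employee WHERE designation = 'Manager'")]
# (3) employees whose salary is less than 20,000
low_paid = [r[0] for r in conn.execute(
    "SELECT eno FROM Employee WHERE salary < 20000")]

assert managers == ['Asha']
assert low_paid == [1]
```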
Q:2 Write Relational Algebra syntax for the given queries using the following
database.
Employee(eno,ename,salary,designation)
Customer(cno,cname,address,city)
1) Find out name of employees who are also customers.
2) Find out name of person who are employees but not customers.
3) Display all names who are either employees or customers. [winter - 2021(3
marks)]
A:2
(Strictly, the rename operator ρ should first be applied to make the two projections
union-compatible, e.g. renaming cname to ename.)
(1) πename(Employee) ∩ πcname(Customer)
(2) πename(Employee) − πcname(Customer)
(3) πename(Employee) ∪ πcname(Customer)
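The three set operations have direct SQL counterparts (INTERSECT, EXCEPT, UNION). A runnable sketch with invented sample names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (eno INT, ename TEXT)")
conn.execute("CREATE TABLE Customer (cno INT, cname TEXT)")
conn.executemany("INSERT INTO Employee VALUES (?, ?)", [(1, 'Asha'), (2, 'Ravi')])
conn.executemany("INSERT INTO Customer VALUES (?, ?)", [(7, 'Ravi'), (8, 'Meena')])

# (1) employees who are also customers -> set intersection
both = [r[0] for r in conn.execute(
    "SELECT ename FROM Employee INTERSECT SELECT cname FROM Customer")]
# (2) employees but not customers -> set difference
only_emp = [r[0] for r in conn.execute(
    "SELECT ename FROM Employee EXCEPT SELECT cname FROM Customer")]
# (3) either employees or customers -> set union (duplicates removed)
either = sorted(r[0] for r in conn.execute(
    "SELECT ename FROM Employee UNION SELECT cname FROM Customer"))

assert both == ['Ravi']
assert only_emp == ['Asha']
assert either == ['Asha', 'Meena', 'Ravi']
```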
Q:3 List unary relational operators and explain with example.[Winter - 2020(4
marks)]

A:3
All those Operators which operate on a single operand are known as unary operators.
There are various types of unary operators in relational algebra.
Types of Unary operators
Unary operators are of three types

1. Projection Operator
2. Selection Operator
3. Rename Operator

Projection Operator
● Projection Operator (π) displays the columns of a table based on the specified
attributes.
● It just works on Columns
Syntax
π<attribute list>(R)

Selection Operator
● Selection Operator (σ) performs a selection operation.
● It selects those rows or tuples from the table which satisfies the selection
condition.
● It works with rows(tuples)
Syntax
σ<selection_condition>(R)

. Rename Operation (ρ)


● To rename relation the rename operation is used which allows us to rename the
output relation
● .’Rename’ operation is denoted with rho(ρ).
Syntax:
ρ x (R)
Q:4 Consider the following relational database schema consisting of the four
relation schemas:[Winter - 2020 (7 marks)]
passenger ( pid, pname, pgender, pcity)
agency ( aid, aname, acity)
flight (fid, fdate, time, src, dest)
booking (pid, aid, fid, fdate)
Answer the following questions using relational algebra queries.
a. Get the details about all flights from Chennai to New Delhi.
b. Get the complete details of all flights to New Delhi.
c. Find the passenger names for passengers who have bookings on at least
one flight.
A:4

A) Get the details about all flights from Chennai to New Delhi.

σ src = “Chennai” ∧ dest = “New Delhi” (flight)


B) Get the complete details of all flights to New Delhi.

σ dest = “New Delhi” (flight)


C) Find the passenger names for passengers who have bookings on at least one
flight.

Π pname (passenger ⨝ booking)


Q: 4 Consider the relation R = {A, B, C, D, E, F, G, H, I, J} and the set of functional
dependencies F={{A, B} → C, A → {D, E}, B → F, F →{G, H}, D →{I, J}}. What is the
key for R? Decompose R into 2NF, then 3NF relations. [summer - 2020(7 marks)]

A:4
{A, B}+ = {A, B, C, D, E, F, G, H, I, J} (using A → DE, D → IJ, B → F, F → GH and
AB → C), so {A, B} is the key for R.
2NF (remove partial dependencies of non-key attributes on the key):
R1(A, D, E, I, J), R2(B, F, G, H), R3(A, B, C)
3NF (further remove transitive dependencies):
R1(A, D, E), R2(D, I, J), R3(B, F), R4(F, G, H), R5(A, B, C)

Q: 5 Describe the Cartesian Product operation in relational algebra. [summer -


2021(4 marks)]
On applying CARTESIAN PRODUCT on two relations that is on two sets of tuples, it will
take every tuple one by one from the left set(relation) and will pair it up with all the
tuples in the right set(relation).
So, the CROSS PRODUCT of two relation A(R1, R2, R3, …, Rp) with degree p, and
B(S1, S2, S3, …, Sn) with degree n, is a relation C(R1, R2, R3, …, Rp, S1, S2, S3, …,
Sn) with degree p + n attributes.
CROSS PRODUCT is a binary set operation, meaning it is applied to two relations at a
time. However, the two relations on which we perform the operation need not have the
same type of tuples, which means union compatibility (or type compatibility) of the
two relations is not necessary.
Notation:
A✕S
where A and S are the relations,
the symbol ‘✕’ is used to denote the CROSS PRODUCT operator.
Example:
Consider two relations STUDENT(SNO, FNAME, LNAME) and DETAIL(ROLLNO, AGE)
below:
We can observe that the number of tuples in STUDENT relation is 2, and the number of
tuples in DETAIL is 2. So the number of tuples in the resulting relation on performing
CROSS PRODUCT is 2*2 = 4.
Important points on CARTESIAN PRODUCT(CROSS PRODUCT) Operation:
The cardinality (number of tuples) of the relation resulting from a Cross Product
operation is equal to the number of tuples (say m) in the first relation multiplied by
the number of tuples (say n) in the second relation.
Cardinality = m*n
The Cross Product of two relation A(R1, R2, R3, …, Rp) with degree p, and B(S1, S2,
S3, …, Sn) with degree n, is a relation C(R1, R2, R3, …, Rp, S1, S2, S3, …, Sn) with
degree p + n attributes.
Degree = p+n
In SQL, CARTESIAN PRODUCT(CROSS PRODUCT) can be applied using CROSS
JOIN.
In general, we do not use the Cartesian Product on its own, since the raw result is
rarely meaningful. Generally, we use a Cartesian Product followed by a Selection
operation that compares attributes of the two relations, as shown below:
σ A=D (A ✕ B)
The above query gives meaningful results.
And this combination of Select and Cross Product operation is so popular that JOIN
operation is inspired by this combination.
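The cardinality and degree rules can be checked with `itertools.product` on two tiny made-up relations (the tuple values below are invented):

```python
from itertools import product

student = {(1, 'Asha', 'K'), (2, 'Ravi', 'M')}   # degree p = 3, m = 2 tuples
detail = {(101, 18), (102, 19)}                  # degree n = 2, n = 2 tuples

# Cross product: pair every STUDENT tuple with every DETAIL tuple,
# concatenating the attribute values.
cross = {s + d for s, d in product(student, detail)}

assert len(cross) == len(student) * len(detail)   # cardinality = m * n = 4
assert all(len(t) == 3 + 2 for t in cross)        # degree = p + n = 5
```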

Q: 6 Consider the relational database given below. Give an expression in the


relational algebra to express each of the following queries:
Employee (person-name, street, city),
Works (person-name, company-name, salary),
Company (company-name, city),
Manages (person-name, manager-name)
(1) Find name of all employees.
(2) Find city of employee whose name is ‘jashu’.
(3) Find name and city of all employees who are having salary>50000.
(4) Find total salary of all employees who are working for company ‘HCL’.
[summer - 2021(7 marks)]

A:6

(1) Πperson-name (employee)

(2) Πcity (σperson-name = ‘jashu’ (employee))

(3) Πperson-name, city (employee ⨝ σsalary > 50000 (works))

(4) Gsum(salary) (σcompany-name = ‘HCL’ (works)), where G denotes the aggregate
operation.

Q.7 List the relational algebra operators. Discuss any two such algebra operator
with suitable example. [Winter- 2019(4 marks)]

A:7
Relational algebra is a procedural query language. It gives a step by step process to
obtain the result of the query. It uses operators to perform queries.

Types of Relational operation

1. Select Operation:

● The select operation selects tuples that satisfy a given predicate.

● It is denoted by sigma (σ).


1. Notation: σ p(r)

Where:

σ is used for selection prediction

r is used for relation

p is a propositional logic formula which may use connectives such as AND, OR and
NOT, and relational operators such as =, ≠, ≥, <, >, ≤.

For example: LOAN Relation


BRANCH_NAME LOAN_NO AMOUNT

Downtown L-17 1000

Redwood L-23 2000

Perryride L-15 1500

Downtown L-14 1500

Mianus L-13 500

Roundhill L-11 900

Perryride L-16 1300

Input:

1. σ BRANCH_NAME="Perryride" (LOAN)
Output:

BRANCH_NAME LOAN_NO AMOUNT

Perryride L-15 1500

Perryride L-16 1300

2. Project Operation:

● This operation shows the list of those attributes that we wish to appear in the
result. Rest of the attributes are eliminated from the table.

● It is denoted by ∏.
1. Notation: ∏ A1, A2, …, An (r)

Where

A1, A2, …, An are attribute names of relation r.

Example: CUSTOMER RELATION

NAME STREET CITY

Jones Main Harrison


Smith North Rye

Hays Main Harrison

Curry North Rye

Johnson Alma Brooklyn

Brooks Senator Brooklyn

Input:

1. ∏ NAME, CITY (CUSTOMER)

Output:

NAME CITY

Jones Harrison

Smith Rye

Hays Harrison
Curry Rye

Johnson Brooklyn

Brooks Brooklyn
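The two example operations above can be reproduced in SQL on the same LOAN data: the WHERE clause plays the role of σ and the column list plays the role of ∏.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LOAN (branch_name TEXT, loan_no TEXT, amount INT)")
conn.executemany("INSERT INTO LOAN VALUES (?, ?, ?)",
                 [('Downtown', 'L-17', 1000), ('Redwood', 'L-23', 2000),
                  ('Perryride', 'L-15', 1500), ('Downtown', 'L-14', 1500),
                  ('Mianus', 'L-13', 500), ('Roundhill', 'L-11', 900),
                  ('Perryride', 'L-16', 1300)])

# sigma BRANCH_NAME = "Perryride" (LOAN)
perryride = sorted(conn.execute(
    "SELECT * FROM LOAN WHERE branch_name = 'Perryride'").fetchall())
assert perryride == [('Perryride', 'L-15', 1500), ('Perryride', 'L-16', 1300)]
```

Note that, unlike the projection operator of relational algebra, plain SQL SELECT keeps duplicate rows; SELECT DISTINCT matches the set semantics of ∏.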

Q:8 The relational database schema is given below.

employee (person-name, street, city)

works (person-name, company-name, salary)

company (company-name, city)

manages (person-name, manager-name)

Write the relational algebra expressions for the given queries.

1.Find the names of all employees who work for First Bank Corporation.

2.Find the names and cities of residence of all employees who work for First Bank

Corporation.

3.. Find the names, street address, and cities of residence of all employees who

work for First Bank Corporation and earn more than $10,000 per annum.

4. Find the names of all employees in this database who do not work for First

Bank Corporation.[Winter-2019(4 marks)]

A:8
1.Find the names of all employees who work for First Bank Corporation.

Πperson-name (σcompany-name = “First Bank Corporation” (works))

2.Find the names and cities of residence of all employees who work for First Bank

Corporation.

Πperson-name, city (employee ⨝ σcompany-name = “First Bank Corporation” (works))

3.. Find the names, street address, and cities of residence of all employees who

work for First Bank Corporation and earn more than $10,000 per annum.

Πperson-name, street, city (σcompany-name = “First Bank Corporation” ∧ salary >
10000 (works) ⨝ employee)

4. Find the names of all employees in this database who do not work for First

Bank Corporation.

If every person works for exactly one company:
Πperson-name (σcompany-name ≠ “First Bank Corporation” (works))
If people may not work for any company:
Πperson-name (employee) − Πperson-name (σcompany-name = “First Bank Corporation”
(works))


Unit - 4: Relational database design

Q: 1 Differentiate lossy decomposition and lossless decomposition. [winter -


2021(4 marks)]

A:1

Q: 2 What is redundancy? Explain insert, update and delete anomalies in


database with example. [winter - 2021(7 marks)]

A:2

It is possible that the same information may be duplicated in different files. This leads to
data redundancy.
● Data redundancy results in memory wastage.
● For example, consider that some customers have both kinds of accounts - saving
and current. In this case, data about customers such as name, address, e-mail
and contact number will be duplicated in both files, saving accounts file and
current account file.
● In other words, same information will be stored in two different locations (files).
And, it wastes memory.

Type of Anomalies in DBMS

Various types of anomalies can occur in a DB. For instance, redundancy anomalies are
a very significant issue for tests if you’re a student, and for job interviews if you’re
searching for a job. But these can be easily identified and fixed. The following are
actually the ones about which we should be worried:
1. Update
2. Insert
3. Delete

Anomalies in databases can be, thus, divided into three major categories:

Update Anomaly

Employee David has two rows in the table given above since he works in two different
departments. If we want to change David’s address, we must do so in two rows, else
the data would become inconsistent.

If the proper address is updated in one of the departments but not in another, David will
have two different addresses in the database, which is incorrect and leads to
inconsistent data.

Insert Anomaly

If a new worker joins the firm and is currently unassigned to any department, we will be
unable to put the data into the table because the w_dept field does not allow nulls.

Delete Anomaly
If the corporation closes the department F890 at some point in the future, deleting the
rows with w_dept as F890 will also erase the information of employee Mike, who is
solely assigned to this department.

Q:3 Consider the relation scheme R = {E, F, G, H, I, J, K, L, M, N} and the set of
functional dependencies {{E, F} → {G}, {F} → {I, J}, {E, H} → {K, L}, {K} → {M},
{L} → {N}} on R. What is the key for R? [winter - 2020(3 marks)]

A:3
Number the given dependencies: {E, F} → {G} ...(i), {F} → {I, J} ...(ii),
{E, H} → {K, L} ...(iii), {K} → {M} ...(iv), {L} → {N} ...(v)

⚈ using augmentation and union on (i) and (ii) we get
{E, F} → {G, I, J} .......(vi)

⚈ using pseudo transitivity on (iii) and (vi) we get

{E, F, H} → {G, I, J, K, L} ........(vii)

⚈ using Decomposition on (vii) we get

{E, F, H} → {K} and {E, F, H} → {L}

Combining above with (iv) and (v) respectively

{E, F, H} → {M} and {E, F, H} → {N}

Now, finally performing union of these with (vii) we get:

{E, F, H} → {G, I, J, K, L, M, N} ...........(viii)

Also {E, F, H} → {E, F, H} is trivial, combine this with (viii) using union, we get:

{E, F, H} → {E, F, G, H, I, J, K, L, M, N}

⇒ {E, F, H} → R

So, {E, F, H} is a key of R.

{E, F, H, K, L} is a super key but not a key, since it is not minimal (i.e. it
contains extra attributes), and {E} clearly cannot be a key of R.
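The derivation above can be double-checked with the standard attribute-closure algorithm, sketched below in Python: {E, F, H}+ covers all of R, and dropping any one of the three attributes breaks that, so {E, F, H} is minimal.

```python
def closure(attrs, fds):
    """Repeatedly apply FDs (lhs -> rhs pairs of sets) until nothing new is added."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

fds = [({'E', 'F'}, {'G'}), ({'F'}, {'I', 'J'}),
       ({'E', 'H'}, {'K', 'L'}), ({'K'}, {'M'}), ({'L'}, {'N'})]
R = set('EFGHIJKLMN')

assert closure({'E', 'F', 'H'}, fds) == R     # {E, F, H} is a superkey
assert closure({'E', 'F'}, fds) != R          # minimal: dropping H fails...
assert closure({'E', 'H'}, fds) != R          # ...as does dropping F...
assert closure({'F', 'H'}, fds) != R          # ...or E
```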


Q.4 Consider a relation scheme R = (A, B, C, D, E, H) on which the following

functional dependencies hold: {A–>B, BC–> D, E–>C, D–>A}. What are the

candidate keys of R?( Any 1 in case of more than one candidate key) [winter -

2020(4 marks)]

A:4
A → B, BC → D, E → C, D → A
We start from the set of all attributes ABCDEH and reduce it using the given
functional dependencies:
ABCDEH → ABCEH (drop D, since BC → D) → ABEH (drop C, since E → C)
→ AEH (drop B, since A → B)
ABCDEH → BCDEH (drop A, since D → A) → BDEH (drop C, since E → C)
→ BEH (drop D, since E → C supplies C and then BC → D)
ABCDEH → ACDEH (drop B, since A → B) → ADEH (drop C, since E → C)
→ DEH (drop A, since D → A)
So the candidate keys are AEH, BEH and DEH.
Q:5 Define the terms: a) Primary Key b) Super Key [winter - 2020(3 marks)]
A:5
A primary key, also called a primary keyword, is a column in a relational database table
that's distinctive for each record. It's a unique identifier, such as a driver's license
number, telephone number with area code or vehicle identification number (VIN). A
relational database table can have only one primary key.
Super key is a single key or a group of multiple keys that can uniquely identify tuples in
a table. Super keys can contain redundant attributes that might not be important for
identifying tuples. Candidate keys are a subset of Super keys

Q:7 Define the terms : a) foreign key b) candidate key.[winter - 2021(3 marks)]

A:7

Foreign Key – is a column that creates a relationship between two tables. The purpose
of Foreign keys is to maintain data integrity and allow navigation between two different
instances of an entity.

Candidate Key – is a set of attributes that uniquely identify tuples in a table. Candidate
Key is a super key with no repeated attributes.
Q:8 What is normalization? Explain 2NF. [summer - 2020(3 marks)]

A:8
● Normalization is the process of organizing the data in the database.
● Normalization is used to minimize the redundancy from a relation or set of
relations.
● It is also used to eliminate undesirable characteristics like Insertion, Update, and
Deletion Anomalies.
● Normalization divides the larger table into smaller and links them using
relationships.
● The normal form is used to reduce redundancy from the database table.
Second Normal Form (2NF):
Second Normal Form (2NF) is based on the concept of full functional dependency.
Second Normal Form applies to relations with composite keys, that is, relations with a
primary key composed of two or more attributes. A relation with a single-attribute
primary key is automatically in at least 2NF. A relation that is not in 2NF may suffer from
the update anomalies.
To be in second normal form, a relation must be in first normal form and relation must
not contain any partial dependency. A relation is in 2NF if it has No Partial Dependency,
i.e., no non-prime attribute (attributes which are not part of any candidate key) is
dependent on any proper subset of any candidate key of the table.
Q:9 Compute the closure of the following set F of functional dependencies for
relation schema R = (A, B, C, D, E).
A->BC
CD-> E
B -> D
E -> A
List the candidate keys for R.
A:9
A -> BC and B -> D give A -> D; then A -> CD and CD -> E give A -> E;
therefore A -> ABCDE
E -> A, A -> ABCDE, so E -> ABCDE
CD -> E, so CD -> ABCDE
B -> D, BC -> CD, so BC -> ABCDE

Attribute closure:
A -> ABCDE
B -> BD
C -> C
D -> D
E -> ABCDE
AB -> ABCDE
AC -> ABCDE
AD -> ABCDE
AE -> ABCDE
BC -> ABCDE
BD -> BD
BE -> ABCDE
CD -> ABCDE
CE -> ABCDE
DE -> ABCDE
ABC -> ABCDE
ABD -> ABCDE
ABE -> ABCDE
ACD -> ABCDE
ACE -> ABCDE
ADE -> ABCDE
BCD -> ABCDE
BCE -> ABCDE
BDE -> ABCDE
CDE -> ABCDE
ABCD -> ABCDE
ABCE -> ABCDE
ABDE -> ABCDE
ACDE -> ABCDE
BCDE -> ABCDE

The candidate keys are A, E, CD, and BC

Any combination of attributes that includes those is a superkey.
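A brute-force check (illustrative only) confirms this: enumerate every attribute subset, keep those whose closure is all of R, and retain only the minimal ones.

```python
from itertools import combinations

def closure(attrs, fds):
    """Standard attribute-closure algorithm over lhs -> rhs pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

R = set('ABCDE')
fds = [({'A'}, {'B', 'C'}), ({'C', 'D'}, {'E'}),
       ({'B'}, {'D'}), ({'E'}, {'A'})]

# Every subset whose closure is R is a superkey...
superkeys = [set(c) for n in range(1, 6)
             for c in combinations(sorted(R), n)
             if closure(c, fds) == R]
# ...and a candidate key is a superkey containing no smaller superkey.
candidate_keys = {frozenset(k) for k in superkeys
                  if not any(s < k for s in superkeys)}

assert candidate_keys == {frozenset('A'), frozenset('E'),
                          frozenset({'C', 'D'}), frozenset({'B', 'C'})}
```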


Q:10 What is normalization? Explain 3NF.[summer - 2020(3 marks)]
A:10
Normalization is a database design technique that reduces data redundancy and
eliminates undesirable characteristics like Insertion, Update and Deletion Anomalies.
Normalization rules divides larger tables into smaller tables and links them using
relationships. The purpose of Normalisation in SQL is to eliminate redundant (repetitive)
data and ensure data is stored logically.
The inventor of the relational model Edgar Codd proposed the theory of normalization of
data with the introduction of the First Normal Form, and he continued to extend theory
with Second and Third Normal Form. Later he joined Raymond F. Boyce to develop the
theory of Boyce-Codd Normal Form.

Below is a 3NF example in SQL database:

We have again divided our tables and created a new table which stores Salutations.

There are no transitive functional dependencies, and hence our table is in 3NF

In Table 3 Salutation ID is primary key, and in Table 1 Salutation ID is foreign to primary


key in Table 3
Now our little example is at a level that cannot further be decomposed to attain higher
normal form types of normalization in DBMS. In fact, it is already in higher normalization
forms. Separate efforts for moving into next levels of normalizing data are normally
needed in complex databases. However, we will be discussing next levels of
normalisation in DBMS in brief in the following.

Q:11 Use the definition of functional dependency to argue that each of


Armstrong’s axioms (reflexivity, augmentation, and transitivity) is sound.[summer
- 2020(7 marks)]
A:11
The term Armstrong axioms refer to the sound and complete set of inference rules or
axioms, introduced by William W. Armstrong, that is used to test the logical implication
of functional dependencies. If F is a set of functional dependencies then the closure of
F, denoted as F^+ , is the set of all functional dependencies logically implied by F.
Armstrong’s Axioms are a set of rules, that when applied repeatedly, generates a
closure of functional dependencies.
Axioms –
Axiom of reflexivity –
If A is a set of attributes and B is a subset of A, then A → B holds. That is, if
B ⊆ A then A → B. This is a trivial property. It is sound because any two tuples
that agree on all attributes of A certainly agree on the subset B of those
attributes.
Axiom of augmentation –
If A → B holds and C is a set of attributes, then AC → BC also holds. That is,
adding attributes to a dependency does not change the basic dependency: if A → B,
then AC → BC for any C. It is sound because tuples that agree on AC agree on A,
hence (by A → B) on B, and they already agree on C, so they agree on BC.
Axiom of transitivity –
Same as the transitive rule in algebra: if A → B holds and B → C holds, then
A → C also holds. A → B means that A functionally determines B. It is sound
because tuples that agree on A agree on B (by A → B), and tuples that agree on B
agree on C (by B → C); hence X → Y and Y → Z imply X → Z.
Q:12 Explain RAID Levels with respect to Data Storage.
A:12
RAID refers to a Redundant Array of Independent Disks. It is a technology which is
used to connect multiple secondary storage devices for increased performance, data
redundancy or both. It gives you the ability to survive one or more drive failures,
depending upon the RAID level used.
It consists of an array of disks in which multiple disks are connected to achieve different
goals.

RAID technology

There are 7 levels of RAID schemes, numbered RAID 0 through RAID 6.

These levels contain the following characteristics:

● It contains a set of physical disk drives.

● In this technology, the operating system views these separate disks as a single
logical disk.

● In this technology, data is distributed across the physical drives of the array.

● Redundant disk capacity is used to store parity information.

● In case of a disk failure, the parity information can be used to recover the data.
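The parity idea (as used in RAID levels 4 and 5) is just a byte-wise XOR of the data blocks: since XOR-ing a value twice cancels it out, any single lost block can be rebuilt from the surviving blocks and the parity block. The byte values below are arbitrary.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

disk1 = bytes([0x10, 0x20, 0x30])
disk2 = bytes([0x0F, 0xF0, 0xAA])
disk3 = bytes([0x55, 0x66, 0x77])
parity = xor_blocks([disk1, disk2, disk3])   # stored on the parity disk

# Suppose disk2 fails: rebuild it from the remaining disks plus the parity block.
rebuilt = xor_blocks([disk1, disk3, parity])
assert rebuilt == disk2
```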

Q:12 Consider schema R = (A, B, C, G, H, I) and the set F of functional


dependencies

{A → B, A → C, CG → H, CG → I, B → H}. Prove that AG →I Holds.[Winter-2019(4


marks)]

A:12

A → H. Since A → B and B → H hold, we apply the transitivity rule.


CG → HI . Since CG → H and CG → I , the union rule implies that CG → HI.
AG → I. Since A → C and CG → I, the pseudo-transitivity rule implies that AG → I
holds.
Q:13 A college maintains details of its lecturers' subject area skills. These details
comprise: Lecturer Number, Lecturer Name, Lecturer Grade, Department Code,
Department Name, Subject Code, Subject Name, Subject Level. Assume that each
lecturer may teach many subjects but may not belong to more than one
department. Subject Code, Subject Name and Subject Level are repeating fields.
Normalize this data to Third Normal Form.

A:13

UNF [unnormalized form]

1. Lecturer Number (PK), Lecturer Name, Lecturer Grade, Department Code,


Department Name, Subject Code, Subject Name, Subject Level

1NF

1. Lecturer Number (PK), Lecturer Name, Lecturer Grade, Department Code,


Department Name

2. Lecturer Number (FK), Subject Code (PK), Subject Name, Subject Level

2NF

1. Lecturer Number (PK), Lecturer Name, Lecturer Grade, Department Code,


Department Name

2. Lecturer Number (FK), Subject Code (PK)

3. Subject Code (PK), Subject Name, Subject Level

3NF

1. Lecturer Number (PK), Lecturer Name, Lecturer Grade

2. Department Code (PK), Department Name

3. Lecturer Number (FK), Subject Code (PK)

4. Subject Code (PK), Subject Name, Subject Level


Q.14 Explain various Normal forms up to 3NF. [Winter-2019(3 marks) ]

Here are the most commonly used normal forms:

1. First normal form(1NF)

2. Second normal form(2NF)

3. Third normal form(3NF)

4. Boyce & Codd normal form (BCNF)

First normal form (1NF)

A relation is said to be in 1NF (first normal form) if it doesn’t contain any
multi-valued attribute. In other words, a relation is in 1NF if each attribute
contains only atomic (single) values.

As per the rule of first normal form, an attribute (column) of a table cannot hold multiple
values. It should hold only atomic values.

Example: Let’s say a company wants to store the names and contact details of its
employees. It creates a table in the database that looks like this:
Two employees (Jon & Lester) have two mobile numbers that caused the Emp_Mobile
field to have multiple values for these two employees.

This table is not in 1NF as the rule says “each attribute of a table must have atomic
(single) values”, the Emp_Mobile values for employees Jon & Lester violates that rule.

To make the table comply with 1NF we need to create a separate row for each mobile
number, so that none of the attributes contains multiple values.

Second normal form (2NF)

A table is said to be in 2NF if both the following conditions hold:

● Table is in 1NF (First normal form)

● No non-prime attribute is dependent on the proper subset of any candidate key of


table.
An attribute that is not part of any candidate key is known as non-prime attribute.

Example: Let’s say a school wants to store the data of teachers and the subjects they
teach. They create a table Teacher that looks like this: Since a teacher can teach
more than one subject, the table can have multiple rows for the same teacher.

Candidate Keys: {Teacher_Id, Subject}

Non prime attribute: Teacher_Age

This table is in 1NF because each attribute has atomic values. However, it is not in 2NF
because non prime attribute Teacher_Age is dependent on Teacher_Id alone which is a
proper subset of candidate key. This violates the rule for 2NF as the rule says “no
non-prime attribute is dependent on the proper subset of any candidate key of the
table”.

To make the table comply with 2NF we can decompose it into two tables like this:
Teacher_Details table:
Third Normal form (3NF)

A table design is said to be in 3NF if both the following conditions hold:

● Table must be in 2NF

● Transitive functional dependency of non-prime attribute on any super key should


be removed.

An attribute that is not part of any candidate key is known as non-prime attribute.

In other words 3NF can be explained like this: A table is in 3NF if it is in 2NF and for
each functional dependency X-> Y at least one of the following conditions hold:

● X is a super key of table

● Y is a prime attribute of table

An attribute that is a part of one of the candidate keys is known as prime attribute.

Example: Let’s say a company wants to store the complete address of each employee,
they create a table named Employee_Details that looks like this:

Super keys: {Emp_Id}, {Emp_Id, Emp_Name}, {Emp_Id, Emp_Name,


Emp_Zip}…so on

Candidate Keys: {Emp_Id}


Non-prime attributes: all attributes except Emp_Id are non-prime as they are not part of
any candidate keys.

Here, Emp_State, Emp_City & Emp_District dependent on Emp_Zip. Further Emp_zip


is dependent on Emp_Id that makes non-prime attributes (Emp_State, Emp_City &
Emp_District) transitively dependent on super key (Emp_Id). This violates the rule of
3NF.

To make this table comply with 3NF we have to decompose it into two tables to
remove the transitive dependency:
Unit - 5: Query processing and optimization

Q:1 What is Query processing? Explain why ‘Parsing and translation’ and
‘Optimization’ steps are required for query processing. [winter - 2021(7 marks)]

❖ Write short note on query processing.[Winter- 2020(7 marks)]


❖ Explain typical query processing strategy of DBMS? [summer - 2020(4
marks)]
❖ Explain various steps involved in query processing with example. [summer
- 2021(7 marks)]

A:1

Query Processing is the activity performed in extracting data from the database.
Answering a query takes several steps. The steps involved are:
● Parsing and translation
● Optimization
● Evaluation

The query processing works in the following way:

Parsing and Translation

Query processing includes certain activities for data retrieval. A user expresses a
query in a high-level database language such as SQL, but before the query can be
processed it must be translated into an internal form that can be used at the
physical level of the file system. SQL is the best suitable choice for humans, but
it is not suitable as the internal representation of the query to the system;
relational algebra is well suited for that purpose. The translation step works like
the parser of a compiler. When a user executes a query, the parser checks the
syntax of the query and verifies the names of the relations, the tuples and the
attribute values used in it. The parser creates a tree of the query, known as a
'parse tree', and translates it into the form of relational algebra, replacing all
uses of views by their definitions. The optimizer then chooses, among the many
equivalent relational-algebra expressions and evaluation strategies, the one with
the lowest estimated cost, and the evaluation engine executes the chosen plan.

Thus, we can understand the working of query processing from the following flow (originally shown as a diagram): parser and translator → relational-algebra expression → optimizer → execution plan → evaluation engine → query output.
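The steps above can be sketched in miniature. The following is a hypothetical Python sketch (not how a real DBMS is implemented): a toy "parser" translates a restricted SELECT query into a relational-algebra-style plan, and a toy "evaluator" runs the plan over a relation stored as a list of dictionaries.

```python
# Toy illustration of parsing/translation and evaluation (hypothetical sketch,
# not a real DBMS): a restricted "SELECT <col> FROM <rel> WHERE <col> > <n>"
# query is translated into a relational-algebra-style plan, then evaluated.

def parse(query):
    # Parsing and translation: build a sigma/pi plan from the SQL text.
    tokens = query.split()
    col, rel = tokens[1], tokens[3]
    where_col, value = tokens[5], int(tokens[7])
    return {"project": col, "relation": rel,
            "select": (where_col, value)}   # sigma_{where_col > value}

def evaluate(plan, database):
    # Evaluation: apply sigma (selection) first, then pi (projection).
    rows = database[plan["relation"]]
    where_col, value = plan["select"]
    selected = [r for r in rows if r[where_col] > value]
    return [r[plan["project"]] for r in selected]

db = {"emp": [{"name": "Asha", "age": 35},
              {"name": "Ravi", "age": 28},
              {"name": "Meena", "age": 41}]}

plan = parse("SELECT name FROM emp WHERE age > 30")
print(evaluate(plan, db))   # ['Asha', 'Meena']
```

A real optimizer would sit between `parse` and `evaluate`, rewriting the plan into a cheaper but equivalent one before evaluation.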

Q:2 List the types of joins in relational algebra. Explain with example. [winter - 2020(4
marks)]

A:2

Join in DBMS is a binary operation which allows you to combine Cartesian product and selection in one single operation. The goal of creating a join condition is that it helps you to combine the data from two or more DBMS tables. The tables in DBMS are associated using primary keys and foreign keys.
Types of Join

There are mainly two types of joins in DBMS:

1. Inner Joins: Theta, Natural, EQUI

2. Outer Join: Left, Right, Full

Inner Join

Inner Join is used to return rows from both tables which satisfy the given condition. It is the most widely used join operation and can be considered the default join type.

An inner join that uses equality comparisons in the join predicate is called an equijoin. However, if other comparison operators such as ">" are used in the join predicate, it can't be called an equijoin.

Inner Join further divided into three subtypes:

● Theta join
● Natural join
● EQUI join
Theta Join

Theta Join allows you to merge two tables based on the condition represented by
theta. Theta joins work for all comparison operators. It is denoted by symbol θ. The
general case of JOIN operation is called a Theta join.

Syntax: A ⋈θ B

EQUI Join

EQUI Join is done when a Theta join uses only the equivalence (equality) condition in the join predicate.

Syntax: A ⋈A.column2 = B.column2 B


Natural Join (⋈)

Natural Join does not utilize any of the comparison operators. In this type of join, the
attributes should have the same name and domain. In Natural Join, there should be at
least one common attribute between two relations.

It performs selection forming equality on those attributes which appear in both relations
and eliminates the duplicate attributes.

Syntax:C ⋈ D

Outer Join

An Outer Join doesn't require each record in the two joined tables to have a matching record. In this type of join, a table retains each of its records even if no matching record exists in the other table.

Three types of Outer Joins are:

● Left Outer Join


● Right Outer Join
● Full Outer Join

Left Outer Join (A ⟕ B)

Left Outer Join returns all the rows from the table on the left even if no matching rows
have been found in the table on the right. When no matching record is found in the table
on the right, NULL is returned.

Syntax: A ⟕ B

Right Outer Join (A ⟖ B)

Right Outer Join returns all the rows from the table on the right even if no matching rows have been found in the table on the left. Where no matches have been found in the table on the left, NULL is returned. RIGHT outer JOIN is the opposite of LEFT JOIN.

Syntax: A ⟖ B

Full Outer Join (A ⟗ B)

In a Full Outer Join , all tuples from both relations are included in the result, irrespective
of the matching condition.

Syntax: A ⟗ B
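The join types above can be illustrated with a small sketch. Assuming relations stored as lists of dictionaries (the table and column names below are invented), the following shows an equi join and a left outer join; right and full outer joins follow the same pattern with the roles of the tables swapped or combined.

```python
# Hedged sketch of an equi join and a left outer join over lists of dicts.
# Relation and attribute names are made up for illustration only.

def equi_join(A, B, col):
    # Inner/equi join: keep only pairs whose join-column values are equal.
    return [{**a, **b} for a in A for b in B if a[col] == b[col]]

def left_outer_join(A, B, col):
    # Left outer join: every row of A survives; unmatched rows get NULL (None)
    # for the attributes that come from B.
    b_cols = {k for b in B for k in b if k != col}
    result = []
    for a in A:
        matches = [b for b in B if b[col] == a[col]]
        if matches:
            result.extend({**a, **b} for b in matches)
        else:
            result.append({**a, **{k: None for k in b_cols}})
    return result

emp = [{"id": 1, "name": "Asha"}, {"id": 2, "name": "Ravi"}]
dept = [{"id": 1, "dept": "IT"}]

print(equi_join(emp, dept, "id"))        # only the matching pair survives
print(left_outer_join(emp, dept, "id"))  # Ravi is kept with dept = None
```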

Q:3 A:3

Query Processing is the activity performed in extracting data from the database.
Query processing involves several steps for fetching the data from the database.
The steps involved are:

-Parsing and translation


-Optimization
-Evaluation
The query processing works in the following way:

Parsing and Translation

Query processing begins with a query written in a high-level database language such as SQL.

SQL is the best suitable choice for humans, but it is not suitable as the internal representation of the query inside the system; relational algebra is well suited for that instead. The system translates the SQL query into a relational-algebra expression that can be used at the physical level of the file system, after which the query-optimizing transformations and the actual evaluation take place. When a user executes a query, the parser checks the syntax of the query and verifies the names of the relations, tuples, and attributes referenced in it in order to generate the internal form of the query. The parser creates a tree of the query, known as a 'parse tree', and translates it into relational algebra. During this translation, it also replaces all uses of views in the query by their definitions.

Query Evaluation Plan

-In order to fully evaluate a query, the system needs to construct a query evaluation
plan.

-The annotations in the evaluation plan may refer to the algorithms to be used for the
particular index or the specific operations.
-Such relational algebra with annotations is referred to as Evaluation Primitives. The
evaluation primitives carry the instructions needed for the evaluation of the operation.

-Thus, a query evaluation plan defines a sequence of primitive operations used for
evaluating a query. The query evaluation plan is also referred to as the query execution
plan.

Optimization

The cost of query evaluation can vary for different types of queries. Since the system is responsible for constructing the evaluation plan, the user need not write the query efficiently.

Usually, a database system generates an efficient query evaluation plan, which minimizes its cost. This task performed by the database system is known as Query Optimization.

For optimizing a query, the query optimizer should have an estimated cost analysis of
each operation. It is because the overall operation cost depends on the memory
allocations to several operations, execution costs, and so on.
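The optimizer's job can be illustrated with a toy cost comparison. The following is a hypothetical sketch (invented row counts and a simplistic cost model that only counts tuples examined), showing why pushing a selection below a join yields a cheaper plan.

```python
# Hypothetical illustration of why optimization matters: pushing a selection
# below a join shrinks the intermediate result. Row counts are invented and
# the "cost" is just the number of tuples/pairs examined.

def cost_join_then_select(n_r, n_s, selectivity):
    # Plan A: join first (n_r * n_s pairs examined), then select.
    return n_r * n_s

def cost_select_then_join(n_r, n_s, selectivity):
    # Plan B: select on R first (n_r tuples scanned), then join the
    # much smaller filtered input with S.
    return n_r + int(n_r * selectivity) * n_s

# 10,000-row R, 1,000-row S, and a selection that keeps 1% of R:
plan_a = cost_join_then_select(10_000, 1_000, 0.01)
plan_b = cost_select_then_join(10_000, 1_000, 0.01)
print(plan_a, plan_b)  # 10000000 110000 -- plan B is far cheaper
```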

Q:4 Write short on block nested loop join.[summer - 2020(4 marks)]

A:4
Block Nested Loop Join:

In block nested loop join, for a block of outer relation, all the tuples in that block are
compared with all the tuples of the inner relation, then only the next block of outer
relation is considered. All pairs of tuples which satisfy the condition are added in the
result of the join.
for each block bR of relation R do
for each block bS of relation S do
for each tuple tR in bR do
for each tuple tS in bS do
compare (tR, tS); if they satisfy the condition
add them in the result of the join
end
end
end
end
Let’s look at some cases similar to those of nested loop join.
Case-1: Assume only two blocks of main memory are available to store blocks from relations R and S.
For each block of relation R, we have to transfer all blocks of relation S, and each block of relation R is transferred only once.
So, the total block transfers needed = BR + BR * BS

Case-2: Assume one relation fits entirely in the main memory and there is at least space
for one extra block.
In this case, total block transfers needed are similar to nested loop join.
Block nested loop join algorithm reduces the access cost compared to nested loop join
if the main memory space allocated for join is limited.
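The algorithm above can be sketched in Python (an illustrative toy where each "block" is just a sublist), with a counter that confirms the BR + BR * BS block-transfer formula from Case-1.

```python
# Sketch of block nested loop join with block-transfer counting (illustrative:
# "blocks" are plain sublists). With R as the outer relation and two memory
# buffers, transfers = BR + BR * BS.

def block_nested_loop_join(R_blocks, S_blocks, cond):
    transfers = 0
    result = []
    for bR in R_blocks:                 # each outer block is read once
        transfers += 1
        for bS in S_blocks:             # all of S is read per outer block
            transfers += 1
            for tR in bR:
                for tS in bS:
                    if cond(tR, tS):
                        result.append((tR, tS))
    return result, transfers

R = [[1, 2], [3, 4]]          # BR = 2 blocks
S = [[2, 3], [5, 6], [3, 7]]  # BS = 3 blocks
pairs, transfers = block_nested_loop_join(R, S, lambda a, b: a == b)
print(pairs)      # [(2, 2), (3, 3), (3, 3)]
print(transfers)  # 2 + 2*3 = 8
```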

Q.5 List the techniques to obtain the query cost. Explain any one.[winter - 2020(7
marks)]

A:5

Though a system can create multiple plans for a query, the chosen method should be
the best of all. It can be done by comparing each possible plan in terms of their
estimated cost. For calculating the net estimated cost of any plan, the cost of each
operation within a plan should be determined and combined to get the net estimated
cost of the query evaluation plan.

The cost estimation of a query evaluation plan is calculated in terms of various


resources that include:

● Number of disk accesses

● Execution time taken by the CPU to execute a query

● Communication costs in distributed or parallel database systems.

To estimate the cost of a query evaluation plan, we use the number of blocks transferred from the disk and the number of disk seeks. Suppose the disk takes an average of tS seconds per block access (the block access time is the sum of the disk seek time and the rotational latency) and an average of tT seconds to transfer one block of data. Then a plan that transfers b blocks and performs S seeks takes about b*tT + S*tS seconds. For example, if tT = 0.1 ms, tS = 4 ms, the block size is 4 KB, and the transfer rate is 40 MB per second, we can easily calculate the estimated cost of a given query evaluation plan.

Generally, for estimating the cost, we consider the worst case that could happen. The users assume that initially, the data is read from the disk only. But there is a chance that the information is already present in main memory. Users usually ignore this effect, and due to this, the actual cost of execution often comes out less than the estimated value.

The response time, i.e., the time required to execute the plan, could be used for
estimating the cost of the query evaluation plan. But due to the following reasons, it
becomes difficult to calculate the response time without actually executing the query
evaluation plan:

● When the query begins its execution, the response time becomes dependent on the contents stored in the buffer. This information is difficult to obtain while the query is being optimized, and it may not be available at all.

● When a system with multiple disks is present, the response time depends on how accesses are distributed among the disks. It is difficult to estimate this without having detailed knowledge of the data layout over the disks.

Consequently, instead of minimizing the response time for any query evaluation plan, optimizers find it better to reduce the total resource consumption of the query plan. Thus, to estimate the cost of a query evaluation plan, it is good to minimize the resources used for accessing the disk and other extra resources.
Unit - 6: Storage strategies

Q:1 Differentiate dynamic hashing and static hashing. [winter - 2021(3 marks)]

Q: 2 What is index in the database? Explain sparse indices and Dense indices
with proper example. [winter - 2021(7 marks)]

❖ What is the role of an index in the database management system? Explain


dense index with example. [summer - 2021(4 marks)]

A:2

● Indexes are special lookup tables that the database search engine can use to
speed up data retrieval.
● A database index is a data structure that improves the speed of data retrieval
operations on a database table.
● An index in a database is very similar to an index in the back of a book.
● Indexes are used to retrieve data from the database very fast.

Dense Index

● In a dense index, there is an index record for every search key value in the
database.
● This makes searching faster but requires more space to store index records.
● In this, the number of records in the index table is the same as the number of
records in the main table.
● Index records contain the search key value and a pointer to the actual record
on the disk.

Sparse Index

● In a sparse index, index records are not created for every search key.
● An index record appears only for a few items in the data file.
● It requires less space and less maintenance overhead for insertions and
deletions, but is slower compared to the dense index for locating records.
● To search a record in a sparse index, we search for the largest value in the
index that is less than or equal to the value we are looking for.
● After getting to the first record via the index, a linear search is performed to
retrieve the desired record.
● In sparse indexing, as the size of the main table grows, the size of the index
table also grows.
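The sparse-index search described above can be sketched as follows (a hypothetical example: each "block" is a sorted sublist of the data file, and the index holds the first key of each block):

```python
# Hedged sketch of a sparse index: one index entry per block (here, per
# sublist). Search: binary-search the index for the last entry <= target,
# then scan that block linearly.

import bisect

blocks = [[2, 5, 8], [11, 14, 17], [21, 24, 29]]   # sorted data file
sparse_index = [b[0] for b in blocks]               # first key of each block

def sparse_lookup(key):
    # Find the rightmost index entry <= key, then scan that block.
    i = bisect.bisect_right(sparse_index, key) - 1
    if i < 0:
        return False                                # key below all entries
    return key in blocks[i]                         # linear scan in the block

print(sparse_lookup(14))  # True
print(sparse_lookup(15))  # False
```

A dense index, by contrast, would hold one entry per record, so the final linear scan within a block would not be needed.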

Q: 3 What is the limitation of index-sequential file? Explain with example how B+


tree overcomes it. [winter - 2021(4 marks)]

A:3

● Indexed sequential access files require unique keys and periodic reorganization.
● Indexed sequential access file takes longer time to search the index for the data
access or retrieval.
● It requires more storage space.
● It is expensive because it requires special software.
● It is less efficient in the use of storage space as compared to other file
organizations.

B+ Tree

B+ Tree file organization uses a tree-like structure to store records in a file, as the name implies. It employs the key indexing idea, in which the primary key is used to sort the records. An index value is generated for each primary key and mapped to the record; the index of a record gives the address of that record in the file.

What is B+ File Organization in DBMS?

The B+ tree file organization is an advanced form of the indexed sequential access mechanism. In this file organization, records are stored in a tree-like structure. It employs a similar key-index idea, in which the primary key is utilised to sort the records. The index value is generated for each primary key and mapped to the record.

Unlike a binary search tree (BST), a node of a B+ tree can have more than two children. All records are stored solely at the leaf nodes in this method. The leaf nodes are pointed to by intermediate nodes, which contain no records themselves.
For example, consider the following B+ tree (originally shown as a diagram):

● The tree has only one root node, which is number 25.
● There is a node-based intermediary layer. They don’t keep the original record.
The only thing they have are pointers to the leaf node.
● Keys smaller than the root key are stored in the subtree to the left of the root,
and keys greater than it are stored to the right; here the intermediate nodes
hold 15 and 30.
● The leaf nodes contain only values, namely 10, 12, 17, 20, 24, 27, and 29.
● Because all of the leaf nodes are balanced, finding any record is much easier.
● This method allows you to search any record by following a single path and
accessing it quickly.

Pros of B+ Tree File Organization

● Because all records are stored solely in the leaf nodes and ordered in a
sequential linked list, searching becomes very simple using this method.
● It’s easier and faster to navigate the tree structure.
● The size of the B+ tree is unrestricted; therefore, the number of records and the
structure of the B+ tree can both expand and shrink.
● It is a very balanced tree structure. Here, each insert, update, or deletion has no
effect on the tree’s performance.

Cons of B+ Tree File Organization

● The B+ tree file organization method is inefficient for static tables, where the data rarely changes.

Q:4 Explain hashing.[summer - 2020(3 marks)]

A:4 Hashing:

Hashing is a popular technique for storing and retrieving data as fast as possible. The
main reason behind using hashing is that it gives optimal results as it performs optimal
searches.
Why to use Hashing?
If you observe carefully, in a balanced binary search tree, searching, inserting, or deleting an element takes O(log n) time. There might be a situation where our application wants to do the same operations in a faster, more optimized way, and here hashing comes into play. In hashing, all the above operations can be performed in O(1), i.e., constant time, on average. It is important to understand that the worst-case time complexity for hashing remains O(n), but the average-case time complexity is O(1).
Now let us understand a few basic operations of hashing.
Basic Operations:
HashTable: This operation is used in order to create a new hash table.
Delete: This operation is used in order to delete a particular key-value pair from the
hash table.
Get: This operation is used in order to search a key inside the hash table and return the
value that is associated with that key.
Put: This operation is used in order to insert a new key-value pair inside the hash table.

DeleteHashTable: This operation is used in order to delete the hash table
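The basic operations above can be sketched with a minimal hash table using separate chaining (an illustrative toy, not production code):

```python
# Minimal sketch of a hash table with separate chaining, showing the basic
# operations listed above: put, get, and delete. Illustrative only.

class HashTable:
    def __init__(self, n_buckets=8):
        # "HashTable" operation: create the table as a list of empty buckets.
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # Hash the key into one of the buckets (O(1) on average).
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None                        # key not present

    def delete(self, key):
        idx = hash(key) % len(self.buckets)
        self.buckets[idx] = [(k, v) for (k, v) in self.buckets[idx]
                             if k != key]

t = HashTable()
t.put("roll_no", 42)
print(t.get("roll_no"))  # 42
t.delete("roll_no")
print(t.get("roll_no"))  # None
```

When many keys land in the same bucket, each bucket's chain grows and lookups degrade toward O(n), which is where the worst-case bound mentioned above comes from.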

Q:5 Explain B-trees.[summer - 2020(3 marks)]

A:5

B-tree in DBMS is an m-way tree which balances itself. Due to their balanced structure, such trees are frequently used to manage and organise enormous databases and facilitate searches. In a B-tree of order m, each node can have a maximum of m child nodes. In DBMS, B-tree is an example of multilevel indexing. Leaf nodes and internal nodes will both have record references. B-trees are called balanced sorted trees because all the leaf nodes are at the same level.
Properties of B-Tree

Following are some of the properties of B-tree in DBMS:

● A non-leaf node's number of keys is one less than the number of its children.
● The number of keys in the root ranges from 1 to (m-1). Therefore, the root
has a minimum of two and a maximum of m children.
● For all non-leaf nodes besides the root, the number of keys ranges from
⌈m/2⌉-1 to m-1. Thus, they can have between ⌈m/2⌉ and m children.
● The level of each leaf node is the same.
Unit - 7: Transaction processing

Q:1 How does two phase locking protocol differ from timestamp based protocol?
Explain timestamp-ordering protocol. [winter - 2021(7 marks)]

A:1

Two-Phase Locking 2PL

This locking protocol divides the execution phase of a transaction into three parts. In the
first part, when the transaction starts executing, it seeks permission for the locks it
requires. The second part is where the transaction acquires all the locks. As soon as the
transaction releases its first lock, the third phase starts. In this phase, the transaction
cannot demand any new locks; it only releases the acquired locks.

Two-phase locking has two phases, one is growing, where all the locks are being
acquired by the transaction; and the second phase is shrinking, where the locks held by
the transaction are being released.

During the growing phase, a transaction that holds a shared (read) lock on an item may upgrade it to an exclusive (write) lock; downgrading a lock is permitted only in the shrinking phase.
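The growing/shrinking rule can be checked mechanically. The following is a hypothetical sketch that verifies whether a single transaction's sequence of lock and unlock actions obeys two-phase locking:

```python
# Hypothetical check that a transaction's lock trace obeys two-phase locking:
# once any lock is released (the shrinking phase begins), no new lock may be
# acquired. Trace format is invented for illustration.

def obeys_2pl(trace):
    # trace: list of ("lock", item) / ("unlock", item) actions of ONE transaction
    shrinking = False
    for action, _item in trace:
        if action == "unlock":
            shrinking = True               # shrinking phase has begun
        elif action == "lock" and shrinking:
            return False                   # acquired a lock after releasing one
    return True

ok  = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
bad = [("lock", "A"), ("unlock", "A"), ("lock", "B")]
print(obeys_2pl(ok))   # True
print(obeys_2pl(bad))  # False
```

Under strict 2PL (below), all the "unlock" actions would additionally be forced to the very end of the trace, at commit time.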

Strict Two-Phase Locking

The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in the first phase, the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a lock immediately after using it. Strict-2PL holds all the locks until the commit point and releases them all at once.

Strict-2PL does not suffer from cascading aborts as 2PL does.

Timestamp Ordering Protocol

The timestamp-ordering protocol ensures serializability among transactions in their conflicting read and write operations. The protocol guarantees that each conflicting pair of operations is executed in the order of the timestamp values of the transactions involved.

The timestamp of transaction Ti is denoted as TS(Ti).

Read time-stamp of data-item X is denoted by R-timestamp(X).

Write time-stamp of data-item X is denoted by W-timestamp(X).

Timestamp ordering protocol works as follows −

If a transaction Ti issues a read(X) operation −

If TS(Ti) < W-timestamp(X)

Operation rejected and Ti rolled back.

If TS(Ti) >= W-timestamp(X)

Operation executed, and R-timestamp(X) is set to the maximum of R-timestamp(X) and TS(Ti).

If a transaction Ti issues a write(X) operation −

If TS(Ti) < R-timestamp(X)

Operation rejected and Ti rolled back.

If TS(Ti) < W-timestamp(X)

Operation rejected and Ti rolled back.

Otherwise, operation executed, and W-timestamp(X) is set to TS(Ti).
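The rules above can be sketched as follows (an illustrative toy where timestamps are plain integers and a rollback is represented by simply returning False):

```python
# Sketch of the timestamp-ordering rules (illustrative): each data item keeps
# an R-timestamp and a W-timestamp; the functions return True if the
# operation is allowed and False where the protocol would roll Ti back.

def can_read(ts_ti, stamps, x):
    # read(X) is rejected when TS(Ti) < W-timestamp(X)
    if ts_ti < stamps[x]["W"]:
        return False                              # Ti would be rolled back
    stamps[x]["R"] = max(stamps[x]["R"], ts_ti)   # update R-timestamp(X)
    return True

def can_write(ts_ti, stamps, x):
    # write(X) is rejected when TS(Ti) < R-timestamp(X) or < W-timestamp(X)
    if ts_ti < stamps[x]["R"] or ts_ti < stamps[x]["W"]:
        return False                              # Ti would be rolled back
    stamps[x]["W"] = ts_ti                        # update W-timestamp(X)
    return True

stamps = {"X": {"R": 0, "W": 0}}
print(can_read(5, stamps, "X"))   # True: R-timestamp(X) becomes 5
print(can_write(3, stamps, "X"))  # False: TS = 3 < R-timestamp(X) = 5
print(can_write(7, stamps, "X"))  # True: W-timestamp(X) becomes 7
print(can_read(6, stamps, "X"))   # False: TS = 6 < W-timestamp(X) = 7
```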

Q:2 Enlist and explain user authorization to modify the database schema. [winter
- 2021(4 marks)]

A:2

Grant

● It is a DCL command.
● It grants permissions to users on database objects.
● It can also be used to assign access rights to users.
● For every user, the permissions need to be specified.
● When the access is decentralized, permission granting is easier.

Syntax:
grant privilege_name on object_name
to {user_name | public | role_name}

Revoke

● It is a DCL command.
● It removes permissions if they are granted to users on database objects.
● It takes away/revokes the rights of the users.
● If access for a user is removed, all specific permissions provided by that user to
others will be removed.
● If decentralized access is used, it would be difficult to remove granted
permissions.

Syntax
revoke privilege_name on object_name
from {user_name | public | role_name}

Q:3 How does ‘partial commit’ state differ from ‘commit’ state of the transaction?
[winter - 2021(3 marks)]

A:3

1. Partially Committed –

After completion of all the read and write operations, the changes are made in main memory or the local buffer. If the changes are made permanent on the database, the state changes to the "committed state"; in case of failure, it goes to the "failed state".

2. Committed State –

It is the state in which the changes have been made permanent on the database; the transaction is complete and is therefore terminated in the "terminated state".

Q:4 What is the atomicity and consistency property of transaction? [winter -


2021(4 marks)]

A:4

In the context of transaction processing, the acronym ACID refers to the four key
properties of a transaction: atomicity, consistency, isolation, and durability.

Atomicity

All changes to data are performed as if they are a single operation. That is, all the
changes are performed, or none of them are.
For example, in an application that transfers funds from one account to another, the
atomicity property ensures that, if a debit is made successfully from one account, the
corresponding credit is made to the other account.

Consistency

Data is in a consistent state when a transaction starts and when it ends.

For example, in an application that transfers funds from one account to another, the
consistency property ensures that the total value of funds in both the accounts is the
same at the start and end of each transaction.
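The funds-transfer example can be sketched as follows (a toy illustration of atomicity and consistency with invented balances; a real DBMS implements this with logging and recovery, not an in-memory snapshot):

```python
# Illustrative sketch of atomicity and consistency for the funds-transfer
# example: either both the debit and the credit happen, or neither does,
# so the total value of funds is preserved. Balances are invented.

def transfer(accounts, src, dst, amount):
    snapshot = dict(accounts)          # state to restore on abort
    try:
        if accounts[src] < amount:
            raise ValueError("insufficient funds")
        accounts[src] -= amount        # debit one account...
        accounts[dst] += amount        # ...and credit the other
    except Exception:
        accounts.clear()
        accounts.update(snapshot)      # abort: roll back to the old state
        return False
    return True                        # commit

accounts = {"A": 100, "B": 50}
transfer(accounts, "A", "B", 30)       # succeeds
print(accounts)                        # {'A': 70, 'B': 80}
transfer(accounts, "A", "B", 500)      # aborts, nothing changes
print(accounts)                        # {'A': 70, 'B': 80}
print(sum(accounts.values()))          # 150 -- consistency: total preserved
```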

Isolation

The intermediate state of a transaction is invisible to other transactions. As a result,


transactions that run concurrently appear to be serialized.

For example, in an application that transfers funds from one account to another, the
isolation property ensures that another transaction sees the transferred funds in one
account or the other, but not in both, nor in neither.

Durability

After a transaction successfully completes, changes to data persist and are not undone,
even in the event of a system failure.

For example, in an application that transfers funds from one account to another, the
durability property ensures that the changes made to each account will not be reversed.

Q: 5 Explain below mentioned features of concurrency. 1) Improved throughput 2) Reduced waiting time. [winter - 2021(4 marks)]

A:5

Improved throughput and resource utilization:

● A transaction consists of many steps. Some involve I/O activity; others involve
CPU activity. The CPU and the disks in a computer system can operate in
parallel. Therefore, I/O activity can be done in parallel with processing at the
CPU.
● The parallelism of the CPU and the I/O system can therefore be exploited to run
multiple transactions in parallel.
● While a read or write on behalf of one transaction is in progress on one disk,
another transaction can be running in the CPU, while another disk may be
executing a read or write on behalf of a third transaction.
● All of this increases the throughput of the system—that is, the number of
transactions executed in a given amount of time.
● Correspondingly, the processor and disk utilization also increase; in other words,
the processor and disk spend less time idle, or not performing any useful work.

Reduced waiting time:

● There may be a mix of transactions running on a system, some short and some
long.
● If transactions run serially, a short transaction may have to wait for a preceding
long transaction to complete, which can lead to unpredictable delays in running a
transaction.
● If the transactions are operating on different parts of the database, it is better to
let them run concurrently, sharing the CPU cycles and disk accesses among
them.
● Concurrent execution reduces the unpredictable delays in running transactions.
● Moreover, it also reduces the average response time: the average time for a
transaction to be completed after it has been submitted.

Q:6 What is transaction? What are the functions of commit and rollback?[summer
- 2020(4 marks)]

A:6
We can define a transaction as a group of tasks in DBMS. Here a single task refers to a
minimum processing unit, and we cannot divide it further. Now let us take the example
of a certain simple transaction. Suppose any worker transfers Rs 1000 from X’s account
to Y’s account. This given small and simple transaction involves various low-level tasks.
X’s Account
Open_Account(X)
Old_Bank_Balance = X.balance
New_Bank_Balance = Old_Bank_Balance – 1000
X.balance = New_Bank_Balance
Close_Bank_Account(X)
Y’s Account
Open_Account(Y)
Old_Bank_Balance = Y.balance
New_Bank_Balance = Old_Bank_Balance + 1000
Y.balance = New_Bank_Balance
Close_Bank_Account(Y)

1. COMMIT-

COMMIT in SQL is a transaction control language (TCL) command that is used to permanently save the changes done in the transaction to tables/databases. The database cannot regain its previous state after the execution of COMMIT.

Example: Consider a STAFF table (its records were shown as a figure in the original):

sql>
SELECT *
FROM Staff
WHERE Allowance = 400;
sql> COMMIT;

Output: the SELECT statement produced an output consisting of three rows, and COMMIT made the changes of the transaction permanent.

2. ROLLBACK

ROLLBACK in SQL is a transaction control language (TCL) command that is used to undo transactions that have not yet been saved to the database. The command can only undo changes made since the last COMMIT.

Example: Consider the same STAFF table:

sql>
SELECT *
FROM Staff
WHERE Allowance = 400;
sql> ROLLBACK;

Output: the SELECT statement produced the same output after the ROLLBACK command, since there were no unsaved changes to undo.

Q:7 List and explain ACID properties with respect to Database transaction.
[winter - 2020(3 marks)]
A:7 A transaction is a single logical unit of work that accesses and possibly modifies the
contents of a database. Transactions access data using read and write operations.

In order to maintain consistency in a database, before and after the transaction, certain
properties are followed. These are called ACID properties.

Atomicity:

By this, we mean that either the entire transaction takes place at once or doesn’t
happen at all. There is no midway i.e. transactions do not occur partially. Each
transaction is considered as one unit and either runs to completion or is not executed at
all. It involves the following two operations.

—Abort: If a transaction aborts, changes made to the database are not visible.

—Commit: If a transaction commits, changes made are visible.

Atomicity is also known as the ‘All or nothing rule’.

Consistency:

This means that integrity constraints must be maintained so that the database is
consistent before and after the transaction. It refers to the correctness of a database.

Isolation:

This property ensures that multiple transactions can occur concurrently without leading
to the inconsistency of the database state. Transactions occur independently without
interference. Changes occurring in a particular transaction will not be visible to any
other transaction until that particular change in that transaction is written to memory or
has been committed. This property ensures that the execution of transactions
concurrently will result in a state that is equivalent to a state achieved these were
executed serially in some order.

Durability:

This property ensures that once the transaction has completed execution, the updates
and modifications to the database are stored in and written to disk and they persist even
if a system failure occurs. These updates now become permanent and are stored in
non-volatile memory. The effects of the transaction, thus, are never lost.

Q:8 Explain conflict serializability and view serializability.[summer - 2020(4


marks)]

A:8
Conflict serializability:
Instructions Ii and Ij, of transactions Ti and Tj respectively, conflict if and only if there
exists some item P accessed by both Ii and Ij, and at least one of these instructions
wrote P.
Consider the operations below-
i. Ii = read(P), Ij = read(P). Ii and Ij don’t conflict.
ii. Ii = read(P), Ij = write(P). They conflict.
iii. Ii = write(P), Ij = read(P). They conflict.
iv. Ii = write(P), Ij = write(P). They conflict.
• A conflict between Ii and Ij forces a temporal order between them.
• If Ii and Ij are consecutive in a schedule and they do not conflict, their results would
remain the same even if they had been interchanged in the schedule.
• If a schedule S can be transformed into a schedule S` by a series of swaps of
non-conflicting instructions, then S and S` are conflict equivalent.
• In other words, a schedule S is conflict serializable if it is conflict equivalent to a serial
schedule.
• Example of a schedule that is not conflict serializable:

T3              T4
read(P)
                write(P)
write(P)

• View serializability:

o S and S` are view equivalent if the following three conditions are met:

i. For each data item P, if transaction Ti reads the initial value of P in schedule S, then
transaction Ti must, in schedule S`, also read the initial value of P.
ii. For each data item P, if transaction Ti executes read (P)in schedule S, and that value
was produced by transaction Tj, then transaction Ti must in schedule S` also read the
value of P that was produced by transaction Tj.

iii. For each data item P, the transaction that performs the final write(P) operation in
schedule S must perform the final write(P) operation in schedule S`.

o View equivalence is also based purely on reads and writes alone.

o A schedule S is view serializable if it is view equivalent to a serial schedule.

o Every conflict serializable schedule is also view serializable.

o Every view serializable schedule which is not conflict serializable has blind writes.

T3              T4              T6
read(P)
write(P)
                write(P)
                                write(P)

Q:9 Explain the concept of Conflict Serializable with suitable schedules.[winter -


2020(3 marks)]

A:9 A schedule is called conflict serializable if, after swapping of its non-conflicting operations, it can be transformed into a serial schedule.

● The schedule will be conflict serializable if it is conflict equivalent to a serial schedule.
Conflicting Operations

Two operations become conflicting if all of these conditions are satisfied:

1. They belong to separate transactions.

2. They access the same data item.

3. At least one of them is a write operation.

Example (the schedules were shown as figures in the original):

Swapping two operations is possible only if the schedules before and after the swap are logically equal. Where S1 = S2 after the swap, the operations are non-conflicting; where S1 ≠ S2, the operations conflict.

Conflict Equivalent

In conflict equivalence, one schedule can be transformed into another by swapping non-conflicting operations. In the given example, S2 is conflict equivalent to S1 (S1 can be converted to S2 by swapping non-conflicting operations).

Two schedules are said to be conflict equivalent if and only if:

1. They contain the same set of transactions.

2. Each pair of conflicting operations is ordered in the same way in both.


Example (schedules S1 and S2 were shown as figures in the original):

Schedule S2 is a serial schedule because, in it, all operations of T1 are performed before any operation of T2 starts. Schedule S1 can be transformed into a serial schedule by swapping its non-conflicting operations.

After swapping the non-conflicting operations, schedule S1 becomes:

T1              T2
read(A)
write(A)
read(B)
write(B)
                read(A)
                write(A)
                read(B)
                write(B)

Hence, S1 is conflict serializable.
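Conflict serializability is commonly tested with a precedence graph: add an edge Ti → Tj for every conflicting pair of operations where Ti's operation comes first, and the schedule is conflict serializable iff the graph has no cycle. The following is a hypothetical sketch of that standard test (schedule encoding invented for illustration):

```python
# Sketch of the precedence-graph test for conflict serializability.
# A schedule is a list of (transaction, operation, item) triples, where the
# operation is "R" (read) or "W" (write), listed in execution order.

def conflict_serializable(schedule):
    edges = set()
    txns = {t for t, _, _ in schedule}
    # Edge Ti -> Tj for each conflicting pair with Ti's operation first.
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            if ti != tj and x == y and "W" in (op_i, op_j):
                edges.add((ti, tj))
    # Cycle check: repeatedly remove nodes that have no incoming edge.
    remaining = set(txns)
    while remaining:
        sources = [t for t in remaining
                   if not any(a in remaining and b == t for a, b in edges)]
        if not sources:
            return False        # a cycle remains: not conflict serializable
        remaining -= set(sources)
    return True

s1 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
s2 = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
print(conflict_serializable(s1))  # True  (equivalent to serial T1, T2)
print(conflict_serializable(s2))  # False (cycle between T1 and T2)
```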

Q:10 List and explain types of locks in transactions.[winter - 2021(3 marks)]

A:10 In lock-based protocols, a transaction cannot read or write a data item until it acquires an appropriate lock on it. There are two types of locks:

1. Shared lock:

● It is also known as a read-only lock. Under a shared lock, the data item can
only be read by the transaction.

● It can be shared between transactions because, while a transaction holds only a
shared lock, it can't update the data item.

2. Exclusive lock:

● Under an exclusive lock, the data item can be both read and written by the
transaction.

● This lock is exclusive: while one transaction holds it, no other transaction can
modify the same data item simultaneously.
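The compatibility between the two lock modes can be sketched as a small table (an illustrative toy lock-manager check, not a real implementation):

```python
# Sketch of the lock compatibility rule: shared (S) locks are compatible with
# each other; an exclusive (X) lock is compatible with nothing.

COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

def can_grant(requested, held_modes):
    # A lock request is granted only if it is compatible with every lock
    # currently held on the item by other transactions.
    return all(COMPATIBLE[(held, requested)] for held in held_modes)

print(can_grant("S", ["S", "S"]))  # True: many readers may share an item
print(can_grant("X", ["S"]))       # False: a writer must wait for readers
print(can_grant("S", ["X"]))       # False: a reader must wait for the writer
print(can_grant("X", []))          # True: nobody holds a lock on the item
```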
Q.11 With neat diagram explain data storage hierarchy. [winter - 2020(4
marks)]

A:11

A database system provides a unified view of the stored data. However, the data itself, in the form of bits and bytes, gets stored in different storage devices.

In this section, we will take an overview of various types of storage devices that are
used for accessing and storing data.

Types of Data Storage

For storing the data, there are different types of storage options available. These
storage types differ from one another as per the speed and accessibility. There are the
following types of storage devices used for storing the data:

● Primary Storage

● Secondary Storage

● Tertiary Storage
Primary Storage

It is the storage area that offers quick access to the stored data. Primary storage is also known as volatile storage, because this type of memory does not store data permanently.

Main Memory: It is responsible for operating on the data made available by the storage medium, and it handles each instruction of the computer machine. This type of memory can store gigabytes of data on a system, but it is usually too small (and too expensive) to hold an entire database.

Cache: It is one of the most expensive storage media, but also the fastest. A cache is a tiny storage medium that is usually managed by the computer hardware.

Secondary Storage

Secondary storage is also called online storage. It is the storage area that allows the user to save and store data permanently. This type of memory does not lose data due to a power failure or system crash; that is why we also call it non-volatile storage.
Flash Memory: Flash memory is commonly packaged as USB (Universal Serial Bus) keys that plug into the USB slots of a computer system. These USB keys make it easy to transfer data to a computer system, though they vary in capacity.

Magnetic Disk Storage: This type of storage media is also known as online storage media. A magnetic disk is used for storing data for a long time and is capable of storing an entire database. It is the responsibility of the computer system to make the data on disk available in main memory for further access.

Tertiary Storage

It is a storage type that is external to the computer system. It has the slowest speed but can store a large amount of data. It is also known as offline storage and is generally used for data backup. The following tertiary
storage devices are available:

● Optical Storage: An optical storage can store megabytes or gigabytes of data. A


Compact Disk (CD) can store 700 megabytes of data with a playtime of around
80 minutes. On the other hand, a Digital Video Disk or a DVD can store 4.7 or
8.5 gigabytes of data on each side of the disk.

● Tape Storage: Tape is a cheaper storage medium than disks. Generally, tapes
are used for archiving or backing up data. It provides slow access because
data is accessed sequentially from the start; thus, tape storage is also known as
sequential-access storage.

Q:12 Explain deadlock with suitable scheduling examples.[Winter-2020(7 marks)]

A:12
A deadlock is a condition where two or more transactions are waiting indefinitely for
one another to give up locks. Deadlock is said to be one of the most feared
complications in DBMS as no task ever gets finished and is in waiting state forever.

For example: In the student table, transaction T1 holds a lock on some rows and needs
to update some rows in the grade table. Simultaneously, transaction T2 holds locks on
some rows in the grade table and needs to update the rows in the Student table held by
Transaction T1.

Now, the main problem arises. Now Transaction T1 is waiting for T2 to release its lock
and similarly, transaction T2 is waiting for T1 to release its lock. All activities come to a
halt state and remain at a standstill. It will remain in a standstill until the DBMS detects
the deadlock and aborts one of the transactions.

Deadlock Avoidance
● When a database is stuck in a deadlock state, it is better to avoid the deadlock than to abort or restart the transactions afterwards, which wastes time and resources.

● A deadlock avoidance mechanism is used to detect any deadlock situation in advance. A method like the wait-for graph is used for detecting deadlock situations, but this method is suitable only for smaller databases. For larger databases, the deadlock prevention method can be used.

Deadlock Detection

In a database, when a transaction waits indefinitely to obtain a lock, the DBMS
should detect whether the transaction is involved in a deadlock. The lock
manager maintains a wait-for graph to detect deadlock cycles in the database.

Wait for Graph

● This is a suitable method for deadlock detection. In this method, a graph is
created based on the transactions and their locks. If the created graph has a cycle
or closed loop, then there is a deadlock.

● The wait-for graph is maintained by the system for every transaction that is
waiting for data held by another. The system keeps checking whether there
is any cycle in the graph.

The wait for a graph for the above scenario is shown below:
Deadlock Prevention

● Deadlock prevention method is suitable for a large database. If the resources are
allocated in such a way that deadlock never occurs, then the deadlock can be
prevented.

● The database management system analyzes the operations of a transaction to determine whether they can create a deadlock situation. If they can, the DBMS never allows that transaction to be executed.

Wait-Die scheme

In this scheme, if a transaction requests for a resource which is already held with a
conflicting lock by another transaction then the DBMS simply checks the timestamp of
both transactions. It allows the older transaction to wait until the resource is available for
execution.

Let's assume there are two transactions Ti and Tj, and let TS(T) be the timestamp of any transaction T. If Tj holds a lock on some resource and Ti requests that resource, then the following actions are performed by the DBMS:

1. Check if TS(Ti) < TS(Tj) - If Ti is the older transaction and Tj has held some
resource, then Ti is allowed to wait until the data-item is available for execution.
That means if the older transaction is waiting for a resource which is locked by
the younger transaction, then the older transaction is allowed to wait for resource
until it is available.

2. Check if TS(Ti) > TS(Tj) - If Ti is the younger transaction and Tj holds the
resource, then Ti is killed (rolled back) and restarted later with a random delay
but with the same timestamp.

Wound wait scheme

● In the wound-wait scheme, if the older transaction requests a resource held
by the younger transaction, then the older transaction forces the younger one to
abort ("wounds" it) and release the resource. After a small delay, the younger
transaction is restarted but with the same timestamp.

● If the older transaction has held a resource which is requested by the Younger
transaction, then the younger transaction is asked to wait until older releases it.
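The two timestamp-based decisions above can be sketched as small Python functions (an illustrative sketch, not part of the original answer; smaller timestamp = older transaction):

```python
# Wait-die vs wound-wait decisions when `requester` asks for a lock
# held by `holder`. Timestamps: smaller means older.

def wait_die(ts_requester: int, ts_holder: int) -> str:
    # Older requester waits; younger requester dies (is rolled back).
    return "wait" if ts_requester < ts_holder else "die"

def wound_wait(ts_requester: int, ts_holder: int) -> str:
    # Older requester wounds (aborts) the younger holder;
    # younger requester simply waits.
    return "wound holder" if ts_requester < ts_holder else "wait"

# T1 (ts=1, older) requests a lock held by T2 (ts=2, younger):
print(wait_die(1, 2))     # wait
print(wound_wait(1, 2))   # wound holder
# T2 (younger) requests a lock held by T1 (older):
print(wait_die(2, 1))     # die
print(wound_wait(2, 1))   # wait
```

Note the symmetry: in both schemes it is always the younger transaction that is rolled back, which is why neither scheme can produce a deadlock cycle.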

Q: 13 List and describe ACID property of transactions. [summer - 2021(4 marks)]

A:13

ACID properties explained


ACID characteristics can be broken down into four properties: atomicity, consistency,
isolation, and durability.

Atomicity
Atomicity refers to the fact that a transaction either succeeds completely or fails completely; it is an all-or-nothing operation. Despite being composed of multiple steps, those steps are treated as a single operation or unit. In the example above, where a system crash stopped the database mid-transaction, the transaction fails, rolling the database back to the previous state and reinstating Alice's money.
Consistency
Consistency requires that data updated via transactions respect the other constraints or rules within the database system, so that the data is kept in a consistent state.

For example, you put in place SQL triggers or integrity constraints that check personal balances and prevent an account from withdrawing more money than it has (your app offers no credit). So if Alice started with $50, she would not be allowed to send $100 to Bob.

Isolation
Modern DBMSs allow users to access data concurrently and in parallel. Isolation is the characteristic that provides concurrency control, so that modifications from one transaction do not affect operations in another transaction. Two parallel transactions are in reality isolated and appear to be performed sequentially.

Durability
The last ACID property, durability, refers to the persistence of committed transactions.
Transactions and database modifications are not kept in volatile memory but are saved
to permanent storage, such as disks.
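Atomicity and consistency can be demonstrated with Python's built-in sqlite3 module. This is a minimal sketch under assumed data (the `accounts` table, names, and balances are made up for illustration): the whole transfer commits or rolls back as one unit, and a CHECK constraint enforces the no-credit rule from the consistency example.

```python
# Minimal sketch of atomicity + consistency with sqlite3.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('Alice', 50), ('Bob', 0)")
conn.commit()

try:
    with conn:  # one transaction: commit on success, rollback on error
        conn.execute("UPDATE accounts SET balance = balance - 100"
                     " WHERE name = 'Alice'")
        conn.execute("UPDATE accounts SET balance = balance + 100"
                     " WHERE name = 'Bob'")
except sqlite3.IntegrityError:
    pass  # CHECK constraint fired: the whole transfer is rolled back

# Both updates were undone, so the data is unchanged and consistent:
print(dict(conn.execute("SELECT name, balance FROM accounts")))
# {'Alice': 50, 'Bob': 0}
```

Alice only has $50, so the CHECK constraint rejects the $100 debit, the `with` block rolls the transaction back, and neither account changes.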

Q:14 Describe various state of transaction. [summer - 2021(3 marks)


A:14
States through which a transaction goes during its lifetime. These are the states which
tell about the current state of the Transaction and also tell how we will further do the
processing in the transactions. These states govern the rules which decide the fate of
the transaction whether it will commit or abort.

1. Active State –
When the instructions of the transaction are running then the transaction is in active
state. If all the ‘read and write’ operations are performed without any error then it
goes to the “partially committed state”; if any instruction fails, it goes to the “failed
state”.
2. Partially Committed –
After completion of all the read and write operation the changes are made in main
memory or local buffer. If the changes are made permanent on the DataBase then
the state will change to “committed state” and in case of failure it will go to the “failed
state”.
3. Failed State –
When any instruction of the transaction fails, it goes to the “failed state” or if failure
occurs in making a permanent change of data on Data Base.
4. Aborted State –
After having any type of failure the transaction goes from “failed state” to “aborted
state” and since in previous states, the changes are only made to local buffer or
main memory and hence these changes are deleted or rolled-back.
5. Committed State –
It is the state when the changes are made permanent on the Data Base and the
transaction is complete and therefore terminated in the “terminated state”.
6. Terminated State –
If there isn’t any roll-back or the transaction comes from the “committed state”, then
the system is consistent and ready for new transaction and the old transaction is
terminated.

Q:15 What is dirty write in the transaction? Explain with example. [summer -
2021(3 marks)]

A:15
A dirty write occurs when a transaction updates or deletes (overwrites) uncommitted data that another transaction has inserted, updated, or deleted.

Example: T1 updates A from 10 to 20 but has not yet committed. T2 then writes A = 30. If T1 now aborts, rolling T1 back restores A to 10 and silently wipes out T2's write; if instead T2 aborts, its rollback may restore T1's uncommitted value. Either way the final value of A is unpredictable, so the database can be left in an inconsistent state.

Q:16 What is a deadlock in transaction? How to detect deadlock in system?


Explain with example. [summer - 2021(7 marks)]
A:16
In a database, a deadlock is an unwanted situation in which two or more transactions
are waiting indefinitely for one another to give up locks. Deadlock is said to be one of
the most feared complications in DBMS, as it brings the whole system to a halt.
Example – let us understand the concept of Deadlock with an example :
Suppose, Transaction T1 holds a lock on some rows in the Students table and needs to
update some rows in the Grades table. Simultaneously, Transaction T2 holds locks on
those very rows (Which T1 needs to update) in the Grades table but needs to update
the rows in the Student table held by Transaction T1.
Now, the main problem arises. Transaction T1 will wait for transaction T2 to give up the
lock, and similarly, transaction T2 will wait for transaction T1 to give up the lock. As a
consequence, All activity comes to a halt and remains at a standstill forever unless the
DBMS detects the deadlock and aborts one of the transactions.

Deadlock Avoidance –
When a database is stuck in a deadlock, it is always better to avoid the deadlock than to restart or abort transactions afterwards. The deadlock avoidance method is suitable for smaller databases, whereas the deadlock prevention method is suitable for larger databases.
One method of avoiding deadlock is using application-consistent logic. In the
above-given example, Transactions that access Students and Grades should always
access the tables in the same order. In this way, in the scenario described above,
Transaction T1 simply waits for transaction T2 to release the lock on Grades before it
begins. When transaction T2 releases the lock, Transaction T1 can proceed freely.
Another method for avoiding deadlock is to apply both row-level locking mechanism and
READ COMMITTED isolation level. However, It does not guarantee to remove
deadlocks completely.
Deadlock Detection –
When a transaction waits indefinitely to obtain a lock, The database management
system should detect whether the transaction is involved in a deadlock or not.
Wait-for-graph is one of the methods for detecting the deadlock situation. This method is
suitable for smaller databases. In this method, a graph is drawn based on the
transaction and their lock on the resource. If the graph created has a closed-loop or a
cycle, then there is a deadlock.
For the above-mentioned scenario, the Wait-For graph is drawn below
Deadlock prevention –
For a large database, the deadlock prevention method is suitable. A deadlock can be
prevented if the resources are allocated in such a way that deadlock never occurs. The
DBMS analyzes the operations whether they can create a deadlock situation or not, If
they do, that transaction is never allowed to be executed.
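The wait-for-graph check described above is just cycle detection on a directed graph. The following is a minimal sketch (the transaction names and edge encoding are illustrative, not from the original answer):

```python
# Deadlock detection on a wait-for graph. An edge "Ti" -> "Tj" means
# "Ti waits for a lock held by Tj"; a cycle means deadlock.

def has_cycle(wait_for: dict) -> bool:
    visited, on_stack = set(), set()

    def dfs(node):
        visited.add(node)
        on_stack.add(node)                      # node is on the current DFS path
        for nxt in wait_for.get(node, []):
            if nxt in on_stack or (nxt not in visited and dfs(nxt)):
                return True                     # back edge found -> cycle
        on_stack.discard(node)
        return False

    return any(dfs(t) for t in list(wait_for) if t not in visited)

# The scenario above: T1 waits for T2 and T2 waits for T1.
print(has_cycle({"T1": ["T2"], "T2": ["T1"]}))  # True  -> deadlock
print(has_cycle({"T1": ["T2"], "T2": []}))      # False -> no deadlock
```

A real DBMS runs such a check at regular intervals and, when a cycle is found, picks a victim transaction on the cycle to abort.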

Q:17 What is the use of two-phase locking protocol in concurrency control?


Describe the two-phase locking protocol in detail. [summer - 2021(7 marks)]

A:17

2PL locking protocol

Every transaction will lock and unlock the data item in two different phases.

Growing phase − The transaction may acquire (issue) locks in this phase but may not release any lock. When all the locks it needs have been acquired, the second phase (the shrinking phase) starts.

Shrinking phase − The transaction may release locks in this phase but may not acquire any new lock.

The 2PL locking protocol is represented diagrammatically as follows −


In the growing phase the transaction reaches a point where all the locks it may need have
been acquired. This point is called the LOCK POINT.

After the lock point has been reached, the transaction enters a shrinking phase.

Types

Two phase locking is of two types −

Strict two phase locking protocol

A transaction can release a shared lock after the lock point, but it cannot release any
exclusive lock until the transaction commits. This protocol produces a cascadeless
schedule.

Cascading schedule: a schedule in which one transaction is dependent on another, so if one has to roll back, the other must roll back too.

Rigorous two phase locking protocol

A transaction cannot release any lock either shared or exclusive until it commits.

The 2PL protocol guarantees serializability, but cannot guarantee that deadlock will not
happen.

Example
Let T1 and T2 are two transactions.

T1=A+B and T2=B+A

Here,

Lock-X(B): T1 cannot acquire Lock-X(B) since B is locked by T2.

Lock-X(A): T2 cannot acquire Lock-X(A) since A is locked by T1.

In this situation T1 waits for B and T2 waits for A. The waiting never ends, and neither transaction can proceed unless one of them releases its lock voluntarily. This situation is called a deadlock.

The wait for graph is as follows −

Wait-for graph: used in the deadlock detection method. Create a node for each transaction and an edge Ti → Tj if Ti is waiting to lock an item locked by Tj. A cycle in the WFG indicates that a deadlock has occurred. The WFG is created at regular intervals.
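The two-phase rule itself ("no new lock after the first unlock") can be sketched as a small class. This is an illustrative sketch only; the class and item names are made up, and a real lock manager would also track lock modes and waiters:

```python
# Sketch of the two-phase rule: once a transaction releases any lock,
# it enters the shrinking phase and may not acquire new locks.

class TwoPhaseTxn:
    def __init__(self):
        self.locks = set()
        self.shrinking = False  # flips to True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violated: cannot lock after first unlock")
        self.locks.add(item)

    def unlock(self, item):
        self.shrinking = True   # lock point has been passed
        self.locks.discard(item)

t = TwoPhaseTxn()
t.lock("A")
t.lock("B")        # growing phase
t.unlock("A")      # lock point passed; shrinking phase begins
try:
    t.lock("C")    # illegal under 2PL
except RuntimeError as e:
    print(e)       # 2PL violated: cannot lock after first unlock
```

Enforcing this single rule on every transaction is what guarantees conflict-serializable schedules, even though (as the answer notes) it does not prevent deadlock.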
Q:18 Differentiate shared lock and exclusive lock in lock-based protocol.
[summer - 2021(3 marks)]

A:18

Shared lock (S): a read-only lock. Many transactions can hold a shared lock on the same data item at the same time, but none of them can update the item while any shared lock is held.

Exclusive lock (X): allows both read and write. Only one transaction can hold an exclusive lock on a data item, and no other lock (shared or exclusive) can be granted on that item until it is released.
Q:19 Discuss view serializability in transactions. [summer - 2021(4 marks)]

A:19

View serializability is a concept that is used to determine whether a schedule is view-serializable or not. A schedule is said to be view-serializable if it is view equivalent to a serial schedule (where no interleaving of transactions is possible).
Why do we need to use View-Serializability ?
There may be schedules that are not conflict-serializable but still give a consistent result, because the concept of conflict-serializability becomes limited when the precedence graph of a schedule contains a loop/cycle. In such a case we cannot predict from the graph alone whether the schedule is consistent or inconsistent. By the concept of conflict-serializability, a schedule is conflict-serializable (serial and consistent) if its precedence graph has no loop/cycle. But what if a schedule's precedence graph contains a cycle and yet gives the same consistent result that a conflict-serializable schedule gives? To address such cases the concept of view-serializability was introduced, so that serializability is not confined to conflict-serializability alone.
Example: Consider a schedule S1 whose precedence graph contains a cycle/loop. S1 is therefore not conflict-serializable, but that does not mean it cannot be consistent and equivalent to a serial schedule; it may or may not be.
Now look at schedule S'1, obtained from S1 by swapping some of the transactions' operations. The precedence graph of S'1 contains no cycle/loop, which means it is conflict serializable (equivalent to a serial schedule, consistent), and its final result is the same as that of S1.
Note: In the above example we understood that if a schedule is conflict-serializable, we can easily predict that it would be:
Equivalent to a serial schedule,
Consistent,
And also view-serializable.
But what if it is non-conflict serializable (precedence graph contains loop). In this
situation, we cannot predict whether it is consistent and serializable or not. As we look
in the above example, where the precedence graph of the Schedule S1 was giving
consistent result, equivalent to the serializable result of the Schedule S’1, despite
containing cycles/loops. So, to address the limitation of the Conflict-Serializability
concept View-Serializability method came into the picture.
Methods to check View-Serializability of a schedule –
Method-1 :
Two schedules S1 and S2 are said to be view-equivalent if the following conditions hold:
1. Initial read: if a transaction reads the initial value of a data item in S1, it must read the initial value of that data item in S2.
2. Read-from: if a transaction reads a value written by transaction Tj in S1, it must read the value written by Tj in S2.
3. Final write: for each data item, the final write must be performed by the same transaction in both schedules.
Method-2 :
First of all, check whether the given schedule is Non-Conflict Serializable or
Conflict-Serializable –
If the given schedule is conflict serializable (means its precedence graph does not
contain any loop/cycle), then the given schedule must be a view serializable. Stop and
submit your final answer.
If the given schedule is non-conflict serializable, then it may or may not be view
serializable. We cannot predict it just by using the concept of conflict serializability, So
we need to look at the below cases.
After performing the above steps, if you find that the schedule is non-conflict-serializable, perform the following steps:
Blind write: a write operation (update) performed without a preceding read of the data item is known as a blind write.
If no blind write exists, the schedule cannot be view-serializable. Stop and submit your final answer.
If a blind write exists, the schedule may or may not be view-serializable, so we need to check further.
If the steps above are inconclusive, draw a dependency graph using the read-from and write dependencies. If no cycle/loop exists in the graph, the schedule is view-serializable; otherwise it is not.
Q: 20 Differentiate closed hashing and open hashing in DBMS. [summer - 2021(3
marks)]

A:20

Open hashing (separate chaining): colliding records are stored outside the main hash table in an overflow chain (e.g. a linked list) attached to each bucket. The table itself never overflows, at the cost of pointer overhead and extra I/O for long chains.

Closed hashing (open addressing): all records are stored within the hash table itself; on a collision, another slot is found by probing (e.g. linear probing). No overflow chains are needed, but performance degrades as the table fills, and deletions are more complicated.

Q: 21 What is log-based recovery? List and explain various fields use in log
records for log-based recovery. [summer - 2021(3 marks)]

A:21

Log-based recovery in DBMS provides the ability to maintain or recover data in case of
system failure. The DBMS keeps a record of every transaction on some stable storage
device so that data can be restored when the system fails: a log record is written
for every operation performed on the database.
An update log record, written as <Ti, Xj, V1, V2>, has these fields:

Transaction identifier: Unique Identifier of the transaction that performed the write
operation.

Data item: Unique identifier of the data item written.

Old value: Value of data item prior to write.

New value: Value of data item after write operation.

Other type of log records are:

<Ti start>: It contains information about when a transaction Ti starts.

<Ti commit>: It contains information about when a transaction Ti commits.

<Ti abort>: It contains information about when a transaction Ti aborts.
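The record types above can be sketched as plain Python tuples. This is a hypothetical encoding for illustration (the transaction name, items, and values are made up); it shows why the log stores the *old* value: during rollback the records are scanned backwards and old values are restored:

```python
# Sketch of a log with start / update / commit records.
# An update record mirrors <Ti, Xj, V1, V2>: (kind, txn, item, old, new).

log = [
    ("start", "T1"),
    ("update", "T1", "A", 100, 80),   # <T1, A, old=100, new=80>
    ("update", "T1", "B", 200, 220),  # <T1, B, old=200, new=220>
    ("commit", "T1"),
]

def undo_values(log, txn):
    """Old values to restore, newest first, if `txn` must be rolled back."""
    return [(rec[2], rec[3]) for rec in reversed(log)
            if rec[0] == "update" and rec[1] == txn]

print(undo_values(log, "T1"))  # [('B', 200), ('A', 100)]
```

Redo works the other way: scan forwards and reapply the new values (V2) for transactions whose `<Ti commit>` record is present in the log.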

Q: 22 What is a schedule in a transaction? How to identify that the given schedule
is conflict serializable? Explain with example. [summer - 2021(7 marks)]

A:22

Conflict Serializable Schedule

● A schedule is called conflict serializable if, after swapping its non-conflicting operations, it can be transformed into a serial schedule.

● A schedule is conflict serializable if it is conflict equivalent to a serial schedule.

Conflicting Operations

The two operations become conflicting if all of the following conditions are satisfied:

1. They belong to separate transactions.

2. They operate on the same data item.

3. At least one of them is a write operation.

Example:

Swapping is possible only if S1 and S2 are logically equal.

Here, S1 = S2 after the swap. That means the operations are non-conflicting.


Here, S1 ≠ S2 after the swap. That means the operations are conflicting.

Conflict Equivalent

Two schedules are conflict equivalent if one can be transformed into the other by swapping non-conflicting operations. In the given example, S2 is conflict equivalent to S1 (S1 can be converted to S2 by swapping non-conflicting operations).

Two schedules are said to be conflict equivalent if and only if:

● They contain the same set of transactions.

● Each pair of conflicting operations is ordered in the same way in both schedules.

Example:

Schedule S2 is a serial schedule because, in this, all operations of T1 are performed before starting any operation of T2. Schedule S1 can be transformed into a serial schedule by swapping non-conflicting operations of S1.

After swapping the non-conflicting operations, schedule S1 becomes a serial schedule; hence S1 is conflict serializable.
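The standard mechanical test for conflict serializability is to build a precedence graph (an edge Ti → Tj for each conflicting pair where Ti's operation comes first) and check it for cycles. The following is a minimal sketch; the schedule encoding as (transaction, operation, item) triples is made up for illustration:

```python
# Sketch: conflict-serializability test via a precedence graph.
# A schedule is a list of (txn, op, item), op in {"R", "W"}.

def conflict_serializable(schedule):
    # Build precedence edges: later conflicting op depends on earlier one.
    edges = set()
    for i, (ti, op1, x1) in enumerate(schedule):
        for tj, op2, x2 in schedule[i + 1:]:
            if ti != tj and x1 == x2 and "W" in (op1, op2):
                edges.add((ti, tj))
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
    # Serializable iff the precedence graph has no cycle (DFS check).
    visited, stack = set(), set()
    def dfs(n):
        visited.add(n)
        stack.add(n)
        for m in graph.get(n, []):
            if m in stack or (m not in visited and dfs(m)):
                return True
        stack.discard(n)
        return False
    return not any(dfs(n) for n in list(graph) if n not in visited)

# S1 from the answer above: all of T1's operations before T2's.
s1 = [("T1","R","A"),("T1","W","A"),("T1","R","B"),("T1","W","B"),
      ("T2","R","A"),("T2","W","A"),("T2","R","B"),("T2","W","B")]
print(conflict_serializable(s1))  # True
```

An interleaving such as T1 reads A, T2 writes A, T1 writes A produces edges in both directions (a cycle), so the same function returns False for it.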


Unit - 8: Database Security

Q:1 Write a short note on SQL injection.[summer - 2020(7 marks)

A:1

SQL injection

An SQL injection, sometimes abbreviated to SQLi, is a type of vulnerability in which an attacker uses a piece of SQL (Structured Query Language) code to manipulate a database and gain access to potentially valuable information. It's one of the most prevalent and threatening types of attack because it can potentially be used against any web application or website that uses an SQL-based database (which is most of them).

How do SQL injection attacks work?

To understand SQL injection, it’s important to know what structured query language
(SQL) is. SQL is a query language used in programming to access, modify, and delete
data stored in relational databases. Since the vast majority of websites and web
applications rely on SQL databases, an SQL injection attack can have serious
consequences for organizations.

An SQL query is a request sent to a database for some type of activity or function such
as query of data or execution of SQL code to be performed. An example is when login
information is submitted via a web form to allow a user access to a site. Typically, this
type of web form is designed to accept only specific types of data such as a name
and/or password. When that information is added, it's checked against a database, and
if it matches, the user is granted entry. If not, they are denied access.

Potential problems arise because most web forms have no way of stopping additional
information from being entered on the forms. Attackers can exploit this weakness and
use input boxes on the form to send their own requests to the database. This could
potentially allow them to carry out a range of nefarious activities, from stealing sensitive
data to manipulating the information in the database for their own ends.

Because of the prevalence of websites and servers that use databases, SQL injection vulnerabilities are among the oldest and most widespread types of cyber attack. Several developments in the hacker community have increased the risk of this type of attack, most notably the advent of tools to detect and exploit SQL injection. Freely available from open-source developers, these tools let cybercriminals automate attacks in only a few minutes, accessing any table or any column in the database with just a few clicks.
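The login-form weakness described above can be demonstrated in a few lines with Python's sqlite3 module. This is an illustrative sketch (the `users` table and credentials are made up): the classic `' OR '1'='1` payload bypasses the vulnerable string-built query, while a parameterized query treats the same input as plain data.

```python
# Sketch of SQL injection vs a parameterized query, using sqlite3.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

name, password = "alice", "' OR '1'='1"  # attacker-controlled input

# VULNERABLE: user input is pasted directly into the SQL string,
# so the WHERE clause becomes ... OR '1'='1', which is always true.
query = f"SELECT * FROM users WHERE name = '{name}' AND password = '{password}'"
print(len(conn.execute(query).fetchall()))  # 1 -> login bypassed!

# SAFE: placeholders keep the input as data, never as SQL syntax.
rows = conn.execute(
    "SELECT * FROM users WHERE name = ? AND password = ?", (name, password)
).fetchall()
print(len(rows))  # 0 -> the injection string is just a wrong password
```

This is also the core defence against SQLi: always bind user input through placeholders (prepared statements) rather than string concatenation.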
Symptoms of SQLi

A successful SQL injection attack may show no symptoms at all. However, sometimes
there are outward signs, which include:

● Receiving an excessive number of requests within a short timeframe. For example, you may see numerous emails from your webpage contact form.
● Ads redirecting to suspicious websites.
● Strange popups and message errors.

Types of SQL injection

Depending on how they gain access to back-end data and the extent of the potential
damage they cause, SQL injections fall into three categories:

● In-band SQLi:

This type of SQLi attack is straightforward for attackers since they use the same
communication channel to launch attacks and gather results. This type of SQLi attack
has two sub-variations:

● Error-based SQLi: The database produces an error message because of the attacker's actions. The attacker gathers information about the database infrastructure based on the data generated by these error messages.
● Union-based SQLi: The attacker uses the UNION SQL operator to obtain the
desired data by fusing multiple select statements in a single HTTP response.
● Inferential SQLi (also known as Blind SQL injection):

This type of SQLi involves attackers using the response and behavioral patterns of the
server after sending data payloads in order to learn more about its structure. Data doesn't
transfer from the website database to the attacker, so the attacker doesn't see
information about the attack in-band (hence the term 'blind' SQLi). Inferential SQLi can
be classified into two sub-types:

● Time-based SQLi: Attackers send a SQL query to the database, making the
database wait for a few seconds before it responds to the query as true or false.
● Boolean SQLi: Attackers send a SQL query to the database, letting the
application respond by generating either a true or false result.
● Out-of-band SQLi:

This type of SQL attack takes place under two scenarios:


When attackers are unable to use the same channel to launch the attack as well as
gather information; or,

When a server is either too slow or unstable to carry out these actions.

● Impact of SQL injection attacks

A successful SQL injection attack can have serious consequences for a business. This
is because an SQL injection attack can:

● Expose sensitive data. Attackers can retrieve data, which risks exposing
sensitive data stored on the SQL server.
● Compromise data integrity. Attackers can alter or delete information from your
system.
● Compromise users’ privacy. Depending on the data stored on the SQL server,
an attack can expose sensitive user information, such as addresses, telephone
numbers, and credit card details.
● Give an attacker admin access to your system. If a database user has
administrative privileges, an attacker can gain access to the system using
malicious code.
● Give an attacker general access to your system. If you use weak SQL
commands to check usernames and passwords, an attacker could gain access to
your system without knowing a user’s credentials. From there, an attacker can
wreak havoc by accessing and manipulating sensitive information.

The cost of an SQL injection attack is not just financial: it can also involve loss of
customer trust and reputational damage, should personal information such as names,
addresses, phone numbers, and credit card details be stolen. Once customer trust is
broken, it can be very difficult to repair.

Q:2 Write a short note on intrusion detection.[summer - 2020(7 marks)]

A:2
An Intrusion Detection System (IDS) is a system that monitors network traffic for
suspicious activity and issues alerts when such activity is discovered. It is a software
application that scans a network or a system for the harmful activity or policy breaching.
Any malicious venture or violation is normally reported either to an administrator or
collected centrally using a security information and event management (SIEM) system.
A SIEM system integrates outputs from multiple sources and uses alarm filtering
techniques to differentiate malicious activity from false alarms.
Although intrusion detection systems monitor networks for potentially malicious activity,
they are also disposed to false alarms. Hence, organizations need to fine-tune their IDS
products when they first install them. It means properly setting up the intrusion detection
systems to recognize what normal traffic on the network looks like as compared to
malicious activity.
Intrusion prevention systems also monitor network packets inbound to the system, check them for malicious activity, and immediately send warning notifications.

Classification of Intrusion Detection System:

IDS are classified into 5 types:


Network Intrusion Detection System (NIDS):
Network intrusion detection systems (NIDS) are set up at a planned point within the
network to examine traffic from all devices on the network. It performs an observation of
passing traffic on the entire subnet and matches the traffic that is passed on the subnets
to the collection of known attacks. Once an attack is identified or abnormal behavior is
observed, the alert can be sent to the administrator. An example of a NIDS is installing it
on the subnet where firewalls are located in order to see if someone is trying to crack
the firewall.
Host Intrusion Detection System (HIDS):
Host intrusion detection systems (HIDS) run on independent hosts or devices on the
network. A HIDS monitors the incoming and outgoing packets from the device only and
will alert the administrator if suspicious or malicious activity is detected. It takes a
snapshot of existing system files and compares it with the previous snapshot. If the
analytical system files were edited or deleted, an alert is sent to the administrator to
investigate. An example of HIDS usage can be seen on mission-critical machines,
which are not expected to change their layout.

Protocol-based Intrusion Detection System (PIDS):


A protocol-based intrusion detection system (PIDS) comprises a system or agent that
resides at the front end of a server and controls and interprets the protocol between a
user/device and the server. It tries to secure the web server by regularly monitoring the
HTTPS protocol stream and accepting the related HTTP protocol. Because the HTTPS
stream is encrypted on the wire, the system must sit at this interface, where the stream
is decrypted before entering the web presentation layer, in order to inspect the HTTP traffic.
Application Protocol-based Intrusion Detection System (APIDS):
Application Protocol-based Intrusion Detection System (APIDS) is a system or agent
that generally resides within a group of servers. It identifies the intrusions by monitoring
and interpreting the communication on application-specific protocols. For example, this
would monitor the SQL protocol explicit to the middleware as it transacts with the
database in the web server.

Hybrid Intrusion Detection System :


A hybrid intrusion detection system is made by combining two or more approaches to
intrusion detection. In a hybrid intrusion detection system, host agent or system data is
combined with network information to develop a complete view of the network system.
A hybrid intrusion detection system is more effective in comparison to the other
intrusion detection systems. Prelude is an example of a hybrid IDS.
Detection Method of IDS:
Signature-based Method:
Signature-based IDS detects attacks on the basis of specific patterns, such as the
number of bytes or particular sequences of 1s and 0s in the network traffic. It also
detects attacks on the basis of already known malicious instruction sequences used by
malware. The detected patterns in the IDS are known as signatures.
Signature-based IDS can easily detect attacks whose pattern (signature) already exists
in the system, but it is quite difficult for it to detect new malware attacks, as their pattern
(signature) is not known.
Anomaly-based Method:
Anomaly-based IDS was introduced to detect unknown malware attacks, as new
malware is developed rapidly. Anomaly-based IDS uses machine learning to build a
model of trustworthy activity; incoming activity is compared with that model and is
declared suspicious if it does not fit. Machine-learning-based methods generalize better
than signature-based IDS, as these models can be trained according to the applications
and hardware configurations.
Comparison of IDS with Firewalls:
IDS and firewalls are both related to network security, but an IDS differs from a firewall
in that a firewall looks outward for intrusions in order to stop them from happening.
Firewalls restrict access between networks to prevent intrusion, and they do not signal
an attack that comes from inside the network. An IDS, in contrast, describes a
suspected intrusion once it has happened and then signals an alarm.
Unit - 9: SQL Concepts

Q:1 When Join is used in SQL? Explain Left outer, Right outer and Full outer join
with SQL syntax. [winter - 2021(7 marks)]

A:1

A join is used to query data from multiple tables and returns the combined result from
two or more tables through a condition. The condition in the join clause indicates how
columns are matched between the specified tables.

What is the LEFT JOIN Clause?

The Left Join clause joins two or more tables and returns all rows from the left table and
matched records from the right table or returns null if it does not find any matching
record. It is also known as Left Outer Join. So, Outer is the optional keyword to use with
Left Join.


Syntax of LEFT JOIN Clause

The following is the general syntax of LEFT JOIN:


SELECT column_list FROM table_name1
LEFT JOIN table_name2
ON column_name1 = column_name2
WHERE join_condition

The following is the general syntax of LEFT OUTER JOIN:


SELECT column_list FROM table_name1
LEFT OUTER JOIN table_name2
ON column_name1 = column_name2
WHERE join_condition
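As a quick check of this behaviour, the following self-contained sketch (table names and data invented for illustration) runs a LEFT JOIN through Python's built-in sqlite3 module; the left-table row with no match comes back with NULL (None) in the right-table column:

```python
import sqlite3

# In-memory database with two illustrative tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_id INTEGER);
    INSERT INTO department VALUES (1, 'HR'), (2, 'IT'), (3, 'Finance');
    INSERT INTO employee VALUES (101, 'Asha', 1), (102, 'Ravi', 2);
""")

# LEFT JOIN: every department (left table) appears; departments with
# no matching employee get NULL (None) for the employee columns.
rows = conn.execute("""
    SELECT d.dept_name, e.emp_name
    FROM department d
    LEFT JOIN employee e ON d.dept_id = e.dept_id
    ORDER BY d.dept_id
""").fetchall()
print(rows)  # [('HR', 'Asha'), ('IT', 'Ravi'), ('Finance', None)]
```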

What is the RIGHT JOIN Clause?

The Right Join clause joins two or more tables and returns all rows from the right-hand
table, and only those results from the other table that fulfilled the specified join
condition. If it finds unmatched records from the left side table, it returns Null value. It is
also known as Right Outer Join. So, Outer is the optional clause to use with Right Join.


Syntax of RIGHT JOIN Clause

The following is the general syntax of RIGHT JOIN:


SELECT column_list FROM table_name1
RIGHT JOIN table_name2
ON column_name1 = column_name2
WHERE join_condition

The following is the general syntax of RIGHT OUTER JOIN:


SELECT column_list FROM table_name1
RIGHT OUTER JOIN table_name2
ON column_name1 = column_name2
WHERE join_condition
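SQLite, used below for a quick check, only gained RIGHT JOIN support in version 3.39, but a RIGHT JOIN is exactly a LEFT JOIN with the two table operands swapped; this illustrative sketch (invented tables and data) verifies the semantics that way:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_id INTEGER);
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    INSERT INTO employee VALUES (101, 'Asha', 1), (102, 'Ravi', 2);
    INSERT INTO department VALUES (1, 'HR'), (2, 'IT'), (3, 'Finance');
""")

# "employee RIGHT JOIN department" keeps every department row and
# fills NULL for unmatched employees -- identical to swapping the
# operands and writing "department LEFT JOIN employee".
rows = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM department d
    LEFT JOIN employee e ON e.dept_id = d.dept_id
    ORDER BY d.dept_id
""").fetchall()
print(rows)  # [('Asha', 'HR'), ('Ravi', 'IT'), (None, 'Finance')]
```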

Q:2 When do we require to use group by clause? How aggregate functions are
used with group by clause? [winter - 2021(4 marks)]

A:2

The GROUP BY clause is used in SQL queries to organize rows that have the same
attribute values. Usually, we use it with the SELECT statement. It is important to
remember that the GROUP BY clause is placed after the WHERE clause and before the
ORDER BY clause.

We often use this clause together with aggregate functions like SUM, AVG, MIN, MAX,
and COUNT to produce summary reports from the database. It is important to
remember that any attribute in the SELECT clause that is not inside an aggregate
function must appear in the GROUP BY clause; otherwise the query is incorrect. As a
result, the GROUP BY clause is always used in conjunction with the SELECT clause. A
query containing the GROUP BY clause is called a grouped query, and it returns a
single row for each grouped object.

The following is the syntax to use GROUP BY clause in a SQL statement:


SELECT column_name, function(column_name)
FROM table_name
WHERE condition
GROUP BY column_name;

Let us understand how the GROUP BY clause works with the help of an example. Here
we will demonstrate it with the same table.

Suppose we want to know developer's average salary in a particular state and organize
results in descending order based on the state column. In that case, we would need
both the GROUP BY and ORDER BY command to get the desired result. We can do
this by executing the command as follows:
mysql> SELECT D_state, avg(D_salary) AS salary
FROM developers
GROUP BY D_state
ORDER BY D_state DESC;
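The same grouped query can be reproduced end-to-end with Python's sqlite3 module (the sample rows are invented for illustration); note that AVG() is computed once per group and one row is returned per state:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE developers (D_name TEXT, D_state TEXT, D_salary INTEGER);
    INSERT INTO developers VALUES
        ('Amit',   'Gujarat', 30000),
        ('Bina',   'Gujarat', 50000),
        ('Chirag', 'Punjab',  40000);
""")

# One output row per state; AVG() aggregates within each group.
rows = conn.execute("""
    SELECT D_state, AVG(D_salary) AS salary
    FROM developers
    GROUP BY D_state
    ORDER BY D_state DESC
""").fetchall()
print(rows)  # [('Punjab', 40000.0), ('Gujarat', 40000.0)]
```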

Q:3 Why does the trigger require in database? Write SQL syntax for creating
database trigger. [winter - 2021(3 marks)]

A:3

A trigger is a named set of SQL statements stored in the database. It is a specialized
category of stored procedure that is called automatically when a database server event
occurs. Each trigger is always associated with a table.

A trigger is called a special procedure because it cannot be called directly like a stored
procedure. The key distinction between the trigger and procedure is that a trigger is
called automatically when a data modification event occurs against a table. A stored
procedure, on the other hand, must be invoked directly.

The following are the main characteristics that distinguish triggers from stored
procedures:

We cannot manually execute/invoke a trigger.

Triggers cannot receive parameters.

A transaction cannot be committed or rolled back inside a trigger.

Syntax of Trigger

We can create a trigger in SQL Server by using the CREATE TRIGGER statement as
follows:
CREATE TRIGGER schema.trigger_name
ON table_name
AFTER {INSERT, UPDATE, DELETE}
[NOT FOR REPLICATION]
AS
{SQL_Statements}
Example:
CREATE TABLE Employee
(
Id INT PRIMARY KEY,
Name VARCHAR(45),
Salary INT,
Gender VARCHAR(12),
DepartmentId INT
)
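The example above only creates the base table. As a hedged, runnable sketch of a complete trigger (using SQLite through Python's sqlite3 module, whose trigger syntax differs slightly from the SQL Server syntax shown above; table, column and trigger names are illustrative), here an audit row is written automatically on every insert:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Name TEXT, Salary INTEGER);
    CREATE TABLE EmployeeAudit (Id INTEGER, Action TEXT);

    -- Fired automatically after every INSERT on Employee;
    -- NEW refers to the row that was just inserted.
    CREATE TRIGGER trg_employee_insert
    AFTER INSERT ON Employee
    BEGIN
        INSERT INTO EmployeeAudit VALUES (NEW.Id, 'inserted');
    END;
""")

conn.execute("INSERT INTO Employee VALUES (1, 'Asha', 50000)")
audit = conn.execute("SELECT * FROM EmployeeAudit").fetchall()
print(audit)  # [(1, 'inserted')]
```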

Q: 4 What is the view? How does it different from the table? [winter - 2021(3
marks)]

A:4

● Views are virtual tables that are compiled at run time.
● The data associated with a view is not physically stored in the view; it is stored in
the base tables of the view.
● A view can be made over one or more database tables.
● Generally, we put in a view those columns that we need to retrieve/query again
and again.
● Once you have created a view, you can query the view just like a table.

TYPES OF VIEW

1. Simple View

2. Complex View

Syntax:
CREATE VIEW view_name
AS
SELECT column1, column2...
FROM table_name
[WHERE condition];

Simple View

● When we create a view on a single table, it is called a simple view.
● In a simple view we can delete, update and insert data, and those changes are
applied to the base table.
● Insert operations can be performed on a simple view only if the view contains the
primary key and all NOT NULL fields of the base table.

Example:
--Create View
CREATE VIEW EmpSelect
AS
SELECT Eid, Ename, Department
FROM Employee;
--Display View
Select * from EmpSelect;

Complex View

● When we create a view on more than one table, it is called a complex view.
● In a complex view we can only update data.
● You can't insert data into a complex view.
● In particular, complex views can contain join conditions, a GROUP BY clause, an
ORDER BY clause, etc.

Example:
--Create View
CREATE VIEW Empview
AS
SELECT Employee.Eid, Employee.Ename, ContactDetails.City
FROM Employee INNER JOIN ContactDetails
ON Employee.Eid = ContactDetails.Eid;
--Display View
Select * from Empview;
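A small runnable sketch (illustrative table and data, Python's sqlite3 module) confirms that a view stores no data of its own and immediately reflects changes made to its base table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Eid INTEGER PRIMARY KEY, Ename TEXT, Department TEXT);
    INSERT INTO Employee VALUES (1, 'Asha', 'HR'), (2, 'Ravi', 'IT');

    -- The view stores no data of its own: it is a saved query.
    CREATE VIEW EmpSelect AS
        SELECT Eid, Ename FROM Employee;
""")

# A change to the base table is immediately visible through the view.
conn.execute("INSERT INTO Employee VALUES (3, 'Bina', 'Finance')")
rows = conn.execute("SELECT * FROM EmpSelect ORDER BY Eid").fetchall()
print(rows)  # [(1, 'Asha'), (2, 'Ravi'), (3, 'Bina')]
```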
Q: 5 What is the difference between Join and Sub query? Explain any two built-in
function for following category. 1) Numeric 2) String 3) Date [winter - 2021(7
marks)]

A:5

What are Joins?

A join is a query that combines records from two or more tables. A join is performed
whenever multiple tables appear in the FROM clause of the query. The select list of the
query can select any columns from any of these tables. If the join condition is omitted or
invalid, a Cartesian product is formed. If any two of these tables have a column name in
common, then you must qualify these columns throughout the query with table names
or table alias names to avoid ambiguity. Most join queries contain at least one join
condition, either in the FROM clause or in the WHERE clause.

What is a Subquery?

A Subquery or Inner query or Nested query is a query within SQL query and embedded
within the WHERE clause. A Subquery is a SELECT statement that is embedded in a
clause of another SQL statement. They can be very useful to select rows from a table
with a condition that depends on the data in the same or another table. A Subquery is
used to return data that will be used in the main query as a condition to further restrict
the data to be retrieved. The subquery can be placed in the following SQL clauses they
are WHERE clause, HAVING clause, FROM clause.
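The following illustrative sketch (Python's sqlite3 module, invented sample data) answers the same question once with a join and once with a subquery, showing that both forms can return identical results:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_no INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (emp_name TEXT, dept_no INTEGER);
    INSERT INTO department VALUES (10, 'Finance'), (20, 'IT');
    INSERT INTO employee VALUES ('Asha', 10), ('Ravi', 20), ('Bina', 10);
""")

# Join: combine the tables and filter on the joined columns.
via_join = conn.execute("""
    SELECT e.emp_name FROM employee e
    JOIN department d ON e.dept_no = d.dept_no
    WHERE d.dept_name = 'Finance'
    ORDER BY e.emp_name
""").fetchall()

# Subquery: the inner SELECT supplies the condition for the outer one.
via_subquery = conn.execute("""
    SELECT emp_name FROM employee
    WHERE dept_no = (SELECT dept_no FROM department WHERE dept_name = 'Finance')
    ORDER BY emp_name
""").fetchall()

print(via_join == via_subquery)  # True -- both return [('Asha',), ('Bina',)]
```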

1. String Functions :

The string functions of MySQL can manipulate text strings in many ways. Commonly
used string functions include CONCAT(), UPPER(), LOWER(), LENGTH(), SUBSTR()
and TRIM().

2. Numeric Functions :

The numeric functions are those functions that accept numeric values and, after
performing the required operations, return numeric values. Some useful numeric
functions are discussed below:

1. MOD() – Returns the remainder of one expression divided by another.
   Example: SELECT MOD(11, 4) "Modulus" ;
2. POWER()/POW() – Returns the value of one expression raised to the power of
   another expression.
   Example: SELECT POWER(3, 2) "Raised" ;
3. ROUND() – Returns a numeric expression rounded to an integer; can also be
   used to round an expression to a given number of decimal places.
   Example: SELECT ROUND(15.193, 1) "Round" ;
4. SIGN() – Returns the sign of a given number.
   Example: SELECT SIGN(-15) "Sign" ;
5. SQRT() – Returns the non-negative square root of a numeric expression.
   Example: SELECT SQRT(26) "Square root" ;
6. TRUNCATE() – Returns exp1 truncated to exp2 decimal places; if exp2 is 0, the
   result has no decimal point.
   Example: SELECT TRUNCATE(15.79, 1) "Truncate" ;

3. Date Functions :

Date functions operate on values of the DATE datatype.

1. CURDATE()/CURRENT_DATE()/CURRENT_DATE – Returns the current date.
   Example: SELECT CURDATE() ;
2. DATE() – Extracts the date part of a date or date-time expression.
   Example: SELECT DATE('2020-12-31 01:02:03') ;
3. MONTH() – Returns the month from the date passed.
   Example: SELECT MONTH('2020-12-31') ;
4. YEAR() – Returns the year from the date passed.
   Example: SELECT YEAR('2020-12-31') ;
5. NOW() – Returns the time at which the statement began executing; the value is
   constant within a single statement.
   Example: SELECT NOW() ;
6. SYSDATE() – Returns the time at which the function itself executes, so two calls
   within one statement can differ.
   Example: SELECT NOW(), SLEEP(2), NOW() ; or
   SELECT SYSDATE(), SLEEP(2), SYSDATE() ;
Q:6 Which operator is used for “For All “type of queries? Explain same with
example. [winter - 2021(7 marks)]

A:6 SQL ALL compares a value from the first table with all values returned from the
second table, and returns the row only if the comparison holds for every one of those values.

For example, if we want to find teachers whose age is greater than all students, we can
use

SELECT *

FROM Teachers

WHERE age > ALL (

SELECT age

FROM Students

);

Here, the subquery

SELECT age

FROM Students

returns all the ages from the Students table. And, the condition

WHERE age > ALL (...)

compares the student ages (returned by the subquery) with the teacher's age. If a
teacher's age is greater than all students' ages, the corresponding row of the Teachers
table is selected.
EXAMPLE:
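SQLite does not support the ALL operator, but `age > ALL (subquery)` is equivalent to comparing against the subquery's maximum, which is what this illustrative sketch (Python's sqlite3 module, invented data) uses to demonstrate the semantics:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Teachers (name TEXT, age INTEGER);
    CREATE TABLE Students (name TEXT, age INTEGER);
    INSERT INTO Teachers VALUES ('Mehta', 45), ('Shah', 24);
    INSERT INTO Students VALUES ('Raj', 22), ('Priya', 25);
""")

# "age > ALL (SELECT age FROM Students)" means greater than the
# maximum age the subquery returns, so MAX() expresses the same test.
rows = conn.execute("""
    SELECT name FROM Teachers
    WHERE age > (SELECT MAX(age) FROM Students)
""").fetchall()
print(rows)  # [('Mehta',)] -- only Mehta (45) is older than every student
```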

Q:7 Define the terms : a) foreign key b) candidate key.[winter - 2021(3 marks)]

A:7

Foreign Key – is a column that creates a relationship between two tables. The purpose
of Foreign keys is to maintain data integrity and allow navigation between two different
instances of an entity.

Candidate Key – is a set of attributes that uniquely identify tuples in a table. Candidate
Key is a super key with no repeated attributes.
Q.8 Explain following SQL commands with syntax and significance.Commit &
Rollback.[winter - 2021 (3 marks)]

A.8

Commit and rollback are the transaction control commands in SQL.

All the commands that are executed consecutively are treated as a single unit of work,
termed a transaction.

● If you want to save all the commands which are executed in a transaction, then
just after completing the transaction, you have to execute the commit command.
This command will save all the commands which are executed on a table. All
these changes made to the table will be saved to the disk permanently.

● Whenever the commit command is executed in SQL, all the updates we have
carried out on the table are saved permanently on the server.

● The rollback command is used to get back to the previous permanent status of
the table, which is saved by the commit command.

Syntax for commit : {a set of SQL statements}; COMMIT;

Syntax for rollback : ROLLBACK;
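The effect of COMMIT and ROLLBACK can be demonstrated with Python's sqlite3 module (illustrative table and data); the uncommitted UPDATE disappears when the transaction is rolled back:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('A', 100)")
conn.commit()    # everything up to here is now permanent

conn.execute("UPDATE accounts SET balance = 0 WHERE name = 'A'")
conn.rollback()  # undo everything since the last COMMIT

balance = conn.execute(
    "SELECT balance FROM accounts WHERE name = 'A'"
).fetchone()[0]
print(balance)   # 100 -- the uncommitted UPDATE was rolled back
```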

Q: 9 What are the importance of Primary key and Unique key in database? Explain
with example. [summer - 2021(4 marks)]

A:9
A primary key is a column of a table which uniquely identifies each tuple (row) in that
table. The primary key enforces integrity constraints on the table. Only one primary key
is allowed in a table. A primary key does not accept duplicate or NULL values. The
primary key value in a table changes very rarely, so it is chosen with care for columns
where changes occur only seldom. A primary key of one table can be referenced by a
foreign key of another table.

For better understanding of primary key we take table named as Student table, having
attributes such as Roll_number, Name, Batch, Phone_number, Citizen_ID.

The Roll_number attribute can never have an identical or NULL value, because every
student enrolled in a university has a unique Roll_number; therefore two students
cannot have the same Roll_number, and each row in the table is uniquely identified by a
student's roll number. So we can make the Roll_number attribute the primary key in this
case.

A unique key constraint also identifies an individual tuple uniquely in a relation or table.
Unlike the primary key, a table can have more than one unique key. A unique key
column can accept only one NULL value. Unique constraints are also referenced by the
foreign key of another table. They can be used when we want to enforce uniqueness on
a column, or a group of columns, that is not the primary key.

For better understanding of unique key we take Student table with Roll_number, Name,
Batch, Phone_number, and Citizen_ID attributes.

The Roll_number attribute is already assigned the primary key, and Citizen_ID can have
a unique constraint, where each entry in the Citizen_ID column should be unique,
because each citizen of a country must have a unique identification number, like an
Aadhaar number. But if a student has migrated from another country, he or she would
not have any Citizen_ID, and the entry could have a NULL value, as one NULL is
allowed by the unique constraint.

Key Differences Between Primary key and Unique key:

- A primary key will not accept NULL values, whereas a unique key can accept NULL
values.
- A table can have only one primary key, whereas there can be multiple unique keys on
a table.
- A clustered index is automatically created when a primary key is defined, whereas a
unique key generates a non-clustered index.
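A short sketch with Python's sqlite3 module illustrates both behaviours on the Student table described above. One caveat: unlike SQL Server, SQLite allows any number of NULLs in a UNIQUE column, so only the "NULL is accepted" half of the unique-key rule is shown here (table and data are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Student (
        Roll_number INTEGER PRIMARY KEY,  -- primary key: no duplicates, no NULL
        Name        TEXT,
        Citizen_ID  TEXT UNIQUE           -- unique key: no duplicate values
    )
""")
conn.execute("INSERT INTO Student VALUES (1, 'Asha', 'C-100')")

# A duplicate primary key value is rejected with an integrity error.
rejected = False
try:
    conn.execute("INSERT INTO Student VALUES (1, 'Ravi', 'C-200')")
except sqlite3.IntegrityError:
    rejected = True

# NULL in the UNIQUE column is accepted (a migrated student with no Citizen_ID).
conn.execute("INSERT INTO Student VALUES (2, 'Bina', NULL)")
total = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
print(rejected, total)  # True 2
```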

Q:10 Describe two rules of mandatory access control. [summer - 2021(3 marks)]

A:10
As users navigate through physical and digital systems, they tend to brush up against
resources and assets that they should or should not have access to. This is particularly
true in digital systems where the lateral movement to different storage, application, or
processing areas can lead to dangerous security threats that undermine the entire
infrastructure.

To maintain separation of assets and resources, security administrators use what are
known as “access controls” that define who can access those resources.

Essentially, once a user is authenticated and authorized to enter a system via a user
account or identity, an access control system sets conditions that determine who, when,
where, and sometimes how that user can navigate the system.

While this concept seems simple on its surface, there are several different access
control schemas that help secure resources against unauthorized access:

Rule-Based Access Control


This approach grants permissions to users based on a structured set of rules and
policies. These rules create a “context” from which resource access can be derived.
These rules are laid out in an Access Control List (ACL) attached to an “object” (the
resource, whether it’s processing permissions, data, account access, etc.).

Some common forms of rule-based access include limiting system access to given
times of the day, or locations (for example, limiting access to devices at or near an office
location).
Q:11 Describe Grant and Revoke commands with suitable example. [summer -
2021(4 marks)]

❖ Explain following SQL commands with syntax and significance of Grant &
Revoke.

A:11

DCL commands are used to enforce database security in a multiple user database
environment. Two types of DCL commands are GRANT and REVOKE. Only Database
Administrator's or owner's of the database object can provide/remove privileges on a
database object.

SQL GRANT Command

SQL GRANT is a command used to provide access or privileges on the database
objects to the users.

The Syntax for the GRANT command is:

GRANT privilege_name
ON object_name
TO {user_name |PUBLIC |role_name}
[WITH GRANT OPTION];

SQL REVOKE Command:

The REVOKE command removes user access rights or privileges to the database
objects.

The Syntax for the REVOKE command is:


REVOKE privilege_name
ON object_name
FROM {user_name |PUBLIC |role_name}

Q:12 Explain SQL injection in brief.

SQL injection is a code-penetration technique that can cause damage to or loss of data
in our database. It is one of the most practiced web-hacking techniques, used to place
malicious code in SQL statements via web-page input. SQL injection can be used by
malicious users to manipulate the application's database server.

SQL injection generally occurs when we ask a user to input their username/userID.
Instead of a name or ID, the user gives us an SQL statement that we will unknowingly
run on our database. For Example - we create a SELECT statement by adding a
variable "demoUserID" to select a string. The variable will be fetched from user input
(getRequestString).

demoUserID = getRequestString("UserId");

demoSQL = "SELECT * FROM users WHERE UserId = " + demoUserID;

Types of SQL injection attacks

SQL injections can do more harm than just bypassing login algorithms. Some SQL
injection attacks include:

● Updating, deleting, and inserting data: an attacker can modify cookies to poison
a web application's database query.
● Executing commands on the server that can download and install malicious
programs such as Trojans.
● Exporting valuable data such as credit card details, emails, and passwords to the
attacker's remote server.
● Getting user login details: this is the simplest form of SQL injection. A web
application typically accepts user input through a form, and the front end passes
the user input to the back-end database for processing.
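The difference between concatenating user input into the query and passing it as a parameter can be demonstrated directly (Python's sqlite3 module; the table, data and injected string are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (UserId TEXT, name TEXT);
    INSERT INTO users VALUES ('1', 'Asha'), ('2', 'Ravi');
""")

malicious_input = "0 OR 1=1"   # classic always-true injection payload

# UNSAFE: string concatenation lets the input rewrite the query itself.
unsafe_sql = "SELECT * FROM users WHERE UserId = " + malicious_input
leaked = conn.execute(unsafe_sql).fetchall()
print(len(leaked))  # 2 -- every row in the table is leaked

# SAFE: a parameterized query treats the input as a plain value.
safe_rows = conn.execute(
    "SELECT * FROM users WHERE UserId = ?", (malicious_input,)
).fetchall()
print(len(safe_rows))  # 0 -- no user has that literal id
```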
Q:13 Consider the following relations and write SQL queries for given statements.
Assume suitable constraints.
job(job-id, job-title, minimum-salary, maximum-salary)
employee(emp-no, emp-name, emp-salary,dept-no)
deposit(acc-no, cust-name, branch-name, amount, account-date)
borrow(loan-no, cust-name, branch-name, amount)
department (dept-no, dept-name)
(1) Give name of employees whose employee number is '001'
(2) Give name of depositors whose branch name starts from ‘S’.
(3) Give employee name(s) whose salary is between Rs. 20000 to 30000 and
department name is Finance.
(4) Update the salary of employee by 10% of their salary who is working in the
Finance department. [summer - 2021(7 marks)]
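A:13 A possible answer sketch. The hyphens in the schema's attribute names are written as underscores so the queries actually run, and the sample rows below are invented purely to verify the queries with Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_no TEXT, emp_name TEXT, emp_salary REAL, dept_no INTEGER);
    CREATE TABLE department (dept_no INTEGER, dept_name TEXT);
    CREATE TABLE deposit (acc_no TEXT, cust_name TEXT, branch_name TEXT,
                          amount INTEGER, account_date TEXT);
    INSERT INTO employee VALUES ('001', 'Asha', 25000, 10), ('002', 'Ravi', 40000, 20);
    INSERT INTO department VALUES (10, 'Finance'), (20, 'IT');
    INSERT INTO deposit VALUES ('A1', 'Meena', 'Surat', 5000, '2021-01-01');
""")

# (1) Name of the employee whose employee number is '001'.
q1 = conn.execute("SELECT emp_name FROM employee WHERE emp_no = '001'").fetchall()

# (2) Depositors whose branch name starts with 'S'.
q2 = conn.execute("SELECT cust_name FROM deposit WHERE branch_name LIKE 'S%'").fetchall()

# (3) Employees earning Rs. 20000-30000 in the Finance department.
q3 = conn.execute("""
    SELECT e.emp_name FROM employee e
    JOIN department d ON e.dept_no = d.dept_no
    WHERE e.emp_salary BETWEEN 20000 AND 30000 AND d.dept_name = 'Finance'
""").fetchall()

# (4) Raise the salary of Finance-department employees by 10%.
conn.execute("""
    UPDATE employee SET emp_salary = emp_salary * 1.1
    WHERE dept_no IN (SELECT dept_no FROM department WHERE dept_name = 'Finance')
""")

print(q1, q2, q3)  # [('Asha',)] [('Meena',)] [('Asha',)]
```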

Q:14 TABLE Worker(WORKER_ID INT NOT NULL PRIMARY


KEY,FIRST_NAME CHAR(25), LAST_NAME CHAR(25),SALARY
INT(15),JOINING_DATE DATETIME,DEPARTMENT CHAR(25));
TABLE Bonus(WORKER_REF_ID INT,BONUS_AMOUNT
INT(10),BONUS_DATE DATETIME,FOREIGN KEY
(WORKER_REF_ID),REFERENCES Worker(WORKER_ID));
TABLE Title(WORKER_REF_ID INT,WORKER_TITLE CHAR(25),
AFFECTED_FROM DATETIME,FOREIGN KEY
(WORKER_REF_ID)REFERENCES Worker(WORKER_ID));
Consider above 3 tables ,assume appropriate data and
solve following SQL queries
1. Print details of the Workers who are also Managers.
2. SQL query to clone a new table from another table.
3. Fetch the list of employees with the same salary.
4. Fetch “FIRST_NAME” from Worker table in upper case.[Winter -2020(4 marks)]
A:14
CREATE DATABASE ORG;
SHOW DATABASES;
USE ORG;

CREATE TABLE Worker (


WORKER_ID INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
FIRST_NAME CHAR(25),
LAST_NAME CHAR(25),
SALARY INT(15),
JOINING_DATE DATETIME,
DEPARTMENT CHAR(25)
);

INSERT INTO Worker


(WORKER_ID, FIRST_NAME, LAST_NAME, SALARY, JOINING_DATE,
DEPARTMENT) VALUES
(001, 'Monika', 'Arora', 100000, '2014-02-20 09:00:00', 'HR'),
(002, 'Niharika', 'Verma', 80000, '2014-06-11 09:00:00', 'Admin'),
(003, 'Vishal', 'Singhal', 300000, '2014-02-20 09:00:00', 'HR'),
(004, 'Amitabh', 'Singh', 500000, '2014-02-20 09:00:00', 'Admin'),
(005, 'Vivek', 'Bhati', 500000, '2014-06-11 09:00:00', 'Admin'),
(006, 'Vipul', 'Diwan', 200000, '2014-06-11 09:00:00', 'Account'),
(007, 'Satish', 'Kumar', 75000, '2014-01-20 09:00:00', 'Account'),
(008, 'Geetika', 'Chauhan', 90000, '2014-04-11 09:00:00', 'Admin');

CREATE TABLE Bonus (


WORKER_REF_ID INT,
BONUS_AMOUNT INT(10),
BONUS_DATE DATETIME,
FOREIGN KEY (WORKER_REF_ID)
REFERENCES Worker(WORKER_ID)
ON DELETE CASCADE
);

INSERT INTO Bonus


(WORKER_REF_ID, BONUS_AMOUNT, BONUS_DATE) VALUES
(001, 5000, '2016-02-20'),
(002, 3000, '2016-06-11'),
(003, 4000, '2016-02-20'),
(001, 4500, '2016-02-20'),
(002, 3500, '2016-06-11');
CREATE TABLE Title (
WORKER_REF_ID INT,
WORKER_TITLE CHAR(25),
AFFECTED_FROM DATETIME,
FOREIGN KEY (WORKER_REF_ID)
REFERENCES Worker(WORKER_ID)
ON DELETE CASCADE
);

INSERT INTO Title


(WORKER_REF_ID, WORKER_TITLE, AFFECTED_FROM) VALUES
(001, 'Manager', '2016-02-20 00:00:00'),
(002, 'Executive', '2016-06-11 00:00:00'),
(008, 'Executive', '2016-06-11 00:00:00'),
(005, 'Manager', '2016-06-11 00:00:00'),
(004, 'Asst. Manager', '2016-06-11 00:00:00'),
(007, 'Executive', '2016-06-11 00:00:00'),
(006, 'Lead', '2016-06-11 00:00:00'),
(003, 'Lead', '2016-06-11 00:00:00');

1. Print details of the Workers who are also Managers.

SELECT DISTINCT W.FIRST_NAME, T.WORKER_TITLE


FROM Worker W
INNER JOIN Title T
ON W.WORKER_ID = T.WORKER_REF_ID
AND T.WORKER_TITLE in ('Manager');

2. SQL query to clone a new table from another table.

-- SQL Server: copy structure and data
SELECT * INTO WorkerClone FROM Worker;
-- SQL Server: copy structure only (the WHERE clause matches no rows)
SELECT * INTO WorkerClone FROM Worker WHERE 1 = 0;
-- MySQL: copy structure only
CREATE TABLE WorkerClone LIKE Worker;

3. Fetch the list of employees with the same salary.

Select distinct W.WORKER_ID, W.FIRST_NAME, W.Salary


from Worker W, Worker W1
where W.Salary = W1.Salary
and W.WORKER_ID != W1.WORKER_ID;

4. Fetch “FIRST_NAME” from Worker table in upper case.

Select upper(FIRST_NAME) from Worker;

Q:15 TABLE Worker(WORKER_ID INT NOT NULL PRIMARY


KEY,FIRST_NAME CHAR(25), LAST_NAME CHAR(25),SALARY
INT(15),JOINING_DATE DATETIME,DEPARTMENT CHAR(25));
TABLE Bonus(WORKER_REF_ID INT,BONUS_AMOUNT
INT(10),BONUS_DATE DATETIME,FOREIGN KEY
(WORKER_REF_ID),REFERENCES Worker(WORKER_ID));
TABLE Title(WORKER_REF_ID INT,WORKER_TITLE CHAR(25),
AFFECTED_FROM DATETIME,FOREIGN KEY
(WORKER_REF_ID)REFERENCES Worker(WORKER_ID));
Consider above 3 tables ,assume appropriate data and
solve following SQL queries.
1. Find out unique values of DEPARTMENT from Worker table
2. Print details of the Workers whose SALARY lies between 100000
and 500000.
3. Print details of the Workers who have joined in Feb’2014.
4. Fetch worker names with salaries >= 50000 and <= 100000.[Winter- 2020(4
marks)]

A:15

CREATE DATABASE ORG;


SHOW DATABASES;
USE ORG;
CREATE TABLE Worker (
WORKER_ID INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
FIRST_NAME CHAR(25),
LAST_NAME CHAR(25),
SALARY INT(15),
JOINING_DATE DATETIME,
DEPARTMENT CHAR(25)
);

INSERT INTO Worker


(WORKER_ID, FIRST_NAME, LAST_NAME, SALARY, JOINING_DATE,
DEPARTMENT) VALUES
(001, 'Monika', 'Arora', 100000, '2014-02-20 09:00:00', 'HR'),
(002, 'Niharika', 'Verma', 80000, '2014-06-11 09:00:00', 'Admin'),
(003, 'Vishal', 'Singhal', 300000, '2014-02-20 09:00:00', 'HR'),
(004, 'Amitabh', 'Singh', 500000, '2014-02-20 09:00:00', 'Admin'),
(005, 'Vivek', 'Bhati', 500000, '2014-06-11 09:00:00', 'Admin'),
(006, 'Vipul', 'Diwan', 200000, '2014-06-11 09:00:00', 'Account'),
(007, 'Satish', 'Kumar', 75000, '2014-01-20 09:00:00', 'Account'),
(008, 'Geetika', 'Chauhan', 90000, '2014-04-11 09:00:00', 'Admin');
CREATE TABLE Bonus (
WORKER_REF_ID INT,
BONUS_AMOUNT INT(10),
BONUS_DATE DATETIME,
FOREIGN KEY (WORKER_REF_ID)
REFERENCES Worker(WORKER_ID)
ON DELETE CASCADE
);

INSERT INTO Bonus


(WORKER_REF_ID, BONUS_AMOUNT, BONUS_DATE) VALUES
(001, 5000, '2016-02-20'),
(002, 3000, '2016-06-11'),
(003, 4000, '2016-02-20'),
(001, 4500, '2016-02-20'),
(002, 3500, '2016-06-11');
CREATE TABLE Title (
WORKER_REF_ID INT,
WORKER_TITLE CHAR(25),
AFFECTED_FROM DATETIME,
FOREIGN KEY (WORKER_REF_ID)
REFERENCES Worker(WORKER_ID)
ON DELETE CASCADE
);

INSERT INTO Title


(WORKER_REF_ID, WORKER_TITLE, AFFECTED_FROM) VALUES
(001, 'Manager', '2016-02-20 00:00:00'),
(002, 'Executive', '2016-06-11 00:00:00'),
(008, 'Executive', '2016-06-11 00:00:00'),
(005, 'Manager', '2016-06-11 00:00:00'),
(004, 'Asst. Manager', '2016-06-11 00:00:00'),
(007, 'Executive', '2016-06-11 00:00:00'),
(006, 'Lead', '2016-06-11 00:00:00'),
(003, 'Lead', '2016-06-11 00:00:00');
1. Find out unique values of DEPARTMENT from Worker table
Select distinct DEPARTMENT from Worker;
2. Print details of the Workers whose SALARY lies between 100000
and 500000.
Select * from Worker where SALARY between 100000 and 500000;
3. Print details of the Workers who have joined in Feb’2014.
Select * from Worker where year(JOINING_DATE) = 2014 and month(JOINING_DATE)
= 2;
4. Fetch worker names with salaries >= 50000 and <= 100000.
SELECT CONCAT(FIRST_NAME, ' ', LAST_NAME) As Worker_Name, Salary
FROM worker
WHERE WORKER_ID IN
(SELECT WORKER_ID FROM worker
WHERE Salary BETWEEN 50000 AND 100000);
Q.16 Define Primary key, Candidate key and Super key. [Winter- 2019(3 marks)]

A:16

Super Key – A super key is a group of single or multiple keys which identifies rows in a
table.
Primary Key – is a column or group of columns in a table that uniquely identify every
row in that table.
Candidate Key – is a set of attributes that uniquely identify tuples in a table. Candidate
Key is a super key with no repeated attributes.
Unit - 10: PL/SQL Concepts

Q:1 What is trigger? Explain its type with their syntax. [summer - 2020(3 marks)]

A:1

A trigger is a PL/SQL block structure which is triggered (executed) automatically when
DML statements like INSERT, DELETE, and UPDATE are executed on a table.

There are three types of triggers.

1. Data Definition Language (DDL) triggers

In SQL Server we can create triggers on DDL statements (like CREATE, ALTER and
DROP) and on certain system-defined stored procedures that perform DDL-like
operations.

2. Data Manipulation Language (DML) triggers

In SQL Server we can create triggers on DML statements (like INSERT, UPDATE and
DELETE) and Stored Procedures that do DML-like operations. DML Triggers are of two
types.

1. After trigger (using FOR/AFTER CLAUSE)

After trigger (using FOR/AFTER CLAUSE): After triggers are executed after the
execution of the DML statement completes.

Example: If you insert a record/row into a table, then the trigger related/associated with
the insert event on this table will be executed only after inserting the record into that table.

If the record/row insertion fails, SQL Server will not execute the after trigger.

Instead of trigger (using INSTEAD OF CLAUSE)

● Instead of trigger (using INSTEAD OF CLAUSE): Instead-of triggers are
executed before the execution of the DML statement starts.
● An instead-of trigger allows us to skip an INSERT, DELETE, or UPDATE
statement to a table and execute other statements defined in the trigger instead.
The actual insert, delete, or update operation does not occur at all.
● Example: If you insert a record/row into a table, then the trigger
related/associated with the insert event on this table will be executed before
inserting the record into that table.
● If the record/row insertion fails, SQL Server will still execute the instead-of trigger.

Logon triggers

● This type of trigger is executed against a LOGON event before a user session is
established to the SQL Server.
● We cannot pass parameters into triggers like stored procedure.
● Triggers are normally slow.
● When triggers can be used,
1. Based on change in one table, we want to update other table.
2. Automatically update derived columns whose values change based on other
columns.
3. Logging.
4. Enforce business rules.

Syntax of Trigger
CREATE [OR ALTER] TRIGGER trigger_name
ON table_name
{ FOR | AFTER | INSTEAD OF }
{ [ INSERT ] [ , ] [ UPDATE ] [ , ] [ DELETE ] }
AS
BEGIN
Executable statements
END;

CREATE [OR ALTER] TRIGGER trigger_name:- This clause creates a trigger with the
given name or overwrites an existing trigger.

[ON table_name]:- This clause identifies the name of the table to which the trigger is
related.

[FOR | AFTER | INSTEAD OF]:- This clause indicates at what time the trigger should
be fired. FOR and AFTER are similar.

[INSERT / UPDATE / DELETE]:- This clause determines on which kind of statement the
trigger should be fired. Either on insert or update or delete or combination of any or all.
More than one statement can be used together separated by Comma. The trigger gets
fired at all the specified triggering event.
Example 1

Trigger to display a message when we perform insert operation on student table.

CREATE TRIGGER student_msg
ON Student
AFTER INSERT
AS
PRINT 'A new student record has been inserted.';
Q:2 Write a PL/SQL block to print the given number is odd or even.[summer -
2020(4 marks)]

A:2

DECLARE
n1 NUMBER := &num1;
BEGIN
-- test if the number provided by the user is even
IF MOD(n1,2) = 0 THEN
DBMS_OUTPUT.PUT_LINE ('The number. '||n1||
' is even number');
ELSE
DBMS_OUTPUT.PUT_LINE ('The number '||n1||' is odd number.');
END IF;
DBMS_OUTPUT.PUT_LINE ('Done Successfully');
END;
/

OUTPUT:

Enter value for num1: 19


old 2: n1 NUMBER := &num1;
new 2: n1 NUMBER := 19;
The number 19 is odd number.
Done Successfully

Q:3 Consider the following relational schemas:


EMPLOYEE (EMPLOYEE_NAME, STREET, CITY)
WORKS (EMPLOYEE_NAME, COMPANYNAME, SALARY)
COMPANY (COMPANY_NAME, CITY)
Give an expression in SQL for each of queries below::
(1) Specify the table definitions in SQL.
(2) Find the names of all employees who work for first Bank Corporation.
(3) Find the names and company names of all employees sorted in ascending
order of company name and descending order of employee names of that
company.
(4) Change the city of First Bank Corporation to ‘New Delhi’. [summer - 2020(7
marks)]
A:3
(2) Find the names of all employees who work for First Bank Corporation.
SELECT employee_name FROM works
WHERE company_name = 'First Bank Corporation';
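The table definitions and the remaining queries can be sketched as follows; column data types and the sample rows are assumptions made only so that the queries can be verified with Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# (1) Table definitions (data types are assumed; the schema gives only names).
conn.executescript("""
    CREATE TABLE employee (employee_name TEXT PRIMARY KEY, street TEXT, city TEXT);
    CREATE TABLE works (employee_name TEXT, company_name TEXT, salary INTEGER);
    CREATE TABLE company (company_name TEXT PRIMARY KEY, city TEXT);
    INSERT INTO works VALUES ('Asha', 'First Bank Corporation', 30000),
                             ('Ravi', 'Acme', 25000),
                             ('Bina', 'First Bank Corporation', 28000);
    INSERT INTO company VALUES ('First Bank Corporation', 'Mumbai'), ('Acme', 'Pune');
""")

# (2) Names of all employees who work for First Bank Corporation.
q2 = conn.execute("""
    SELECT employee_name FROM works
    WHERE company_name = 'First Bank Corporation'
    ORDER BY employee_name
""").fetchall()

# (3) Employee and company names, ascending by company name and
#     descending by employee name within each company.
q3 = conn.execute("""
    SELECT employee_name, company_name FROM works
    ORDER BY company_name ASC, employee_name DESC
""").fetchall()

# (4) Change the city of First Bank Corporation to 'New Delhi'.
conn.execute("""
    UPDATE company SET city = 'New Delhi'
    WHERE company_name = 'First Bank Corporation'
""")
city = conn.execute("""
    SELECT city FROM company WHERE company_name = 'First Bank Corporation'
""").fetchone()[0]

print(q2, q3, city)
```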

Q:4 Explain cursor and its types. [summer - 2020(3 marks)]

❖ What is the use of a cursor in PL/SQL? Explain with example. [summer -


2021(3 marks)]

A:4

When an SQL statement is processed, Oracle creates a memory area known as the context area. A cursor is a pointer to this context area, which contains all the information needed for processing the statement. In PL/SQL, the context area is controlled through the cursor. A cursor holds information on a SELECT statement and the rows of data accessed by it.

A cursor is used by a program to fetch and process the rows returned by the SQL statement, one at a time. There are two types of cursors:

1. Implicit Cursors

2. Explicit Cursors

1) PL/SQL Implicit Cursors

The implicit cursors are automatically generated by Oracle while an SQL statement is
executed, if you don't use an explicit cursor for the statement.

These are created by default to process the statements when DML statements like
INSERT, UPDATE, DELETE etc. are executed.
Oracle provides some attributes, known as implicit cursor attributes, to check the status of DML operations. Some of them are: %FOUND, %NOTFOUND, %ROWCOUNT and %ISOPEN.
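As a sketch, the implicit cursor attributes can be checked immediately after a DML statement through the `SQL` cursor. The employees table and dept_no column below are assumed for illustration only.

```sql
BEGIN
   -- Implicit cursor is created automatically for this UPDATE
   UPDATE employees SET salary = salary * 1.10 WHERE dept_no = 10;

   IF SQL%FOUND THEN
      DBMS_OUTPUT.PUT_LINE(SQL%ROWCOUNT || ' row(s) updated.');
   ELSE
      DBMS_OUTPUT.PUT_LINE('No rows were updated.');
   END IF;
END;
/
```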

2) PL/SQL Explicit Cursors

The Explicit cursors are defined by the programmers to gain more control over the
context area. These cursors should be defined in the declaration section of the PL/SQL
block. It is created on a SELECT statement which returns more than one row.

Syntax of explicit cursor

Following is the syntax to create an explicit cursor:

CURSOR cursor_name IS select_statement;

Steps:

You must follow these steps while working with an explicit cursor:

1. Declare the cursor to initialize it in the memory.

2. Open the cursor to allocate memory.

3. Fetch the cursor to retrieve data.

4. Close the cursor to release the allocated memory.

1) Declare the cursor:

It defines the cursor with a name and the associated SELECT statement.

Syntax for explicit cursor declaration

CURSOR name IS

SELECT statement;

2) Open the cursor:

It is used to allocate memory for the cursor and make it easy to fetch the rows returned
by the SQL statements into it.

Syntax for cursor open:

OPEN cursor_name;

3) Fetch the cursor:


It is used to access one row at a time. You can fetch rows from the above-opened
cursor as follows:

Syntax for cursor fetch:

FETCH cursor_name INTO variable_list;

4) Close the cursor:

It is used to release the allocated memory. The following syntax is used to close the
above-opened cursors.

Syntax for cursor close:

Close cursor_name;
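Putting the four steps together, a complete explicit cursor block looks like the following sketch. A student table with columns roll_no and name is assumed here for illustration.

```sql
DECLARE
   s_roll student.roll_no%TYPE;
   s_name student.name%TYPE;
   -- 1) Declare the cursor
   CURSOR c_student IS
      SELECT roll_no, name FROM student;
BEGIN
   -- 2) Open the cursor
   OPEN c_student;
   LOOP
      -- 3) Fetch one row at a time
      FETCH c_student INTO s_roll, s_name;
      EXIT WHEN c_student%NOTFOUND;
      DBMS_OUTPUT.PUT_LINE(s_roll || ' ' || s_name);
   END LOOP;
   -- 4) Close the cursor
   CLOSE c_student;
END;
/
```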

Q:5 Write a PL/SQL block to print the sum of even numbers from 1 to 50. [summer
- 2020(4 marks)]

A:5

Declare

a number;

sum1 number := 0;

Begin

a := 1;

loop

if mod(a, 2) = 0 then
sum1 := sum1 + a;
end if;

exit when (a = 50);

a := a + 1;

end loop;

dbms_output.put_line('Sum of even numbers between 1 to 50 is ' || sum1);

End;
/
Q:6 Given the following relations
TRAIN (NAME, START, DEST)
TICKET (PNRNO., START, DEST, FARE)
PASSENGER (NAME, ADDRESS, PNRNO.)
Write SQL expressions for the following queries:
Note: Assume NAME of Train is a column of Ticket.
(1) List the names of passengers who are travelling from the start to the
destination station of the train.
(2) List the names of passengers who have a return journey ticket.
(3) Insert a new Shatabti train from Delhi to Bangalore.
(4) Cancel the ticket of Tintin. [summer - 2020(7 marks)]
A: 6
(i) SELECT P.NAME FROM TRAIN T, TICKET I, PASSENGER P
WHERE P.PNRNO = I.PNRNO AND T.NAME = I.NAME
AND T.START = I.START AND T.DEST = I.DEST;
(ii) SELECT DISTINCT P1.NAME
FROM PASSENGER P1, PASSENGER P2, TICKET A, TICKET B
WHERE P1.NAME = P2.NAME
AND P1.PNRNO = A.PNRNO AND P2.PNRNO = B.PNRNO
AND A.START = B.DEST AND A.DEST = B.START;
(iii) INSERT INTO TRAIN VALUES ('Shatabdi', 'Delhi', 'Bangalore');
(iv) DELETE FROM TICKET
WHERE PNRNO IN (SELECT PNRNO FROM PASSENGER
WHERE NAME = 'Tintin');
