
The Relational Data Model

Origins of the Relational Model


The relational model was developed by British computer scientist E.F. (Ted) Codd of IBM in a seminal paper in 1970 (A Relational Model of Data for Large Shared Data Banks, Communications of the ACM, June 1970).
o Considered ingenious but impractical in 1970
o Conceptually simple
o Computers at the time lacked the power to implement the relational model
o Today, microcomputers can run sophisticated relational database software

Relational Model Concepts

Domain: A (usually named) set/universe of atomic values, where by "atomic" we mean simply that, from the point of view of the database, each value in the domain is indivisible (i.e., cannot be broken down into component parts).

Examples of domains:
o SSN: string of digits of length nine
o Name: string of characters beginning with an upper case letter
o GPA: a real number between 0.0 and 4.0
o Sex: a member of the set { female, male }
o Dept_Code: a member of the set { CSE, IT, ECE, EEE, MECH, ... }

These are all logical descriptions of domains. For implementation purposes, it is necessary to provide descriptions of domains in terms of concrete data types (or formats) that are provided by the DBMS (such as String, int, boolean), in a manner analogous to how programming languages have intrinsic data types.

Attribute: the name of the role played by some value (coming from some domain) in the context of a relational schema. The domain of attribute A is denoted dom(A).

Tuple: A tuple is a mapping from attributes to values drawn from the respective domains of those attributes. A tuple is intended to describe some entity (or relationship between entities) in the miniworld.

As an example, a tuple for a PERSON entity might be


{ Name --> "Rama Krishna", Sex --> Male, IQ --> 786 }

Relation: A (named) set of tuples all of the same form (i.e., having the same set of attributes). The term table is a loose synonym.

ER-CSEB

Relational Schema: used for describing (the structure of) a relation. E.g., R(A1, A2, ..., An) says that R is a relation with attributes A1, ..., An. The degree of a relation is the number of attributes it has, here n.

Example: STUDENT(Name, SSN, Address)

One would think that a "complete" relational schema would also specify the domain of each attribute.

Relational Database: A collection of relations, each one consistent with its specified relational schema.

Characteristics of Relations
Ordering of Tuples: A relation is a set of tuples; hence, there is no order associated with them.

Ordering of Attributes: A tuple is best viewed as a mapping from its attributes (i.e., the names we give to the roles played by the values comprising the tuple) to the corresponding values. Hence, the order in which the attributes are listed in a table is irrelevant. (Note that, unfortunately, the set theoretic operations in relational algebra (at least how Elmasri & Navathe define them) make implicit use of the order of the attributes. Hence, Elmasri & Navathe view attributes as being arranged as a sequence rather than a set.)

Values of Attributes: For a relation to be in First Normal Form, each of its attribute domains must consist of atomic (neither composite nor multi-valued) values. Much of the theory underlying the relational model was based upon this assumption.

Interpretation of a Relation: Each relation can be viewed as a predicate and each tuple an assertion that that predicate is satisfied (i.e., has value true) for the combination of values in it. In other words, each tuple represents a fact.
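The first two characteristics can be illustrated with a minimal Python sketch. It assumes a tuple is stored as a frozen set of (attribute, value) pairs and a relation as a Python set; the helper name make_tuple and the second person are invented for illustration.

```python
def make_tuple(**attrs):
    """Model a tuple as a hashable mapping from attribute names to values."""
    return frozenset(attrs.items())

t1 = make_tuple(Name="Rama Krishna", Sex="Male")
t2 = make_tuple(Sex="Male", Name="Rama Krishna")   # same attributes, listed in another order
t3 = make_tuple(Name="Sita Devi", Sex="Female")    # hypothetical second person

# A relation is a set of tuples: duplicates collapse and no order is implied.
person = {t1, t2, t3}

print(t1 == t2)      # True: the order of attributes is irrelevant
print(len(person))   # 2: the duplicate tuple is stored only once
```

Because the mapping view carries attribute names with every value, nothing in this representation depends on column position.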

Relational Model Constraints and Relational Database Schemas


Constraints on databases can be categorized as follows:
o Inherent model-based: for example, no two tuples in a relation can be duplicates (because a relation is a set of tuples).
o Schema-based: can be expressed using DDL; this kind is the focus of this section.
o Application-based: specific to the "business rules" of the miniworld and typically difficult or impossible to express and enforce within the data model. Hence, it is left to application programs to enforce them.


Elaborating upon schema-based constraints:

Domain Constraints: Each attribute value must be either null (which is really a nonvalue) or drawn from the domain of that attribute.

Key Constraints: A relation is a set of tuples, and each tuple's "identity" is given by the values of its attributes. Hence, it makes no sense for two tuples in a relation to be identical (because then the two tuples are actually one and the same tuple). That is, no two tuples may have the same combination of values in their attributes.

Superkey: a subset of the attributes of a relation for which no two tuples can have the same combination of values. From the fact that no two tuples can be identical, it follows that the set of all attributes of a relation constitutes a superkey of that relation.

Key: a minimal superkey, i.e., a superkey such that, if we were to remove any of its attributes, the resulting set of attributes would fail to be a superkey.

Example: Suppose that we stipulate that a faculty member is uniquely identified by Name and Address and also by Name and Department, but by no single one of the three attributes mentioned. Then { Name, Address, Department } is a (non-minimal) superkey and each of { Name, Address } and { Name, Department } is a key (i.e., minimal superkey).

Candidate key: any key (i.e., any minimal superkey).

Primary key: a key chosen to act as the means by which to identify tuples in a relation. Typically, one prefers a primary key to be one having as few attributes as possible.
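The faculty example can be checked mechanically on a sample state. The sketch below is only an illustration: the rows are invented, and a key is really a property of the schema (it must hold in every legal state), so a test on one state can refute a proposed key but cannot prove it.

```python
from itertools import combinations

# Hypothetical FACULTY state; the values are invented for illustration.
faculty = [
    {"Name": "Rao",  "Address": "12 Lake Rd", "Department": "CSE"},
    {"Name": "Rao",  "Address": "7 Hill St",  "Department": "ECE"},
    {"Name": "Devi", "Address": "12 Lake Rd", "Department": "CSE"},
]

def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs (in this state)."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Minimal superkeys of this state, found by trying smaller sets first."""
    all_attrs = sorted(rows[0])
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            # Skip any set that contains an already-found key: it is not minimal.
            minimal = not any(set(k) <= set(attrs) for k in keys)
            if minimal and is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

print(candidate_keys(faculty))
# [('Address', 'Name'), ('Department', 'Name')]
```

The full attribute set is a superkey of this state but is not reported, because it contains a smaller superkey.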

Entity Integrity, Referential Integrity, and Foreign Keys


Entity Integrity Constraint: states that no primary key value can be null.

Referential Integrity Constraint: specified between two relations and used to maintain consistency among the tuples of the two relations. It states that a tuple in one relation that refers to another relation must refer to an existing tuple in that relation. For example, the attribute DNO of EMPLOYEE gives the number of the department for which each employee works; hence, its value in every EMPLOYEE tuple must match the DNUMBER value of some tuple in the DEPARTMENT relation.


Foreign Key: A set of attributes FK in relation schema R1 is a foreign key of R1 that references relation R2 if it satisfies the following two rules:
1. The attributes in FK have the same domain(s) as the primary key attributes PK of R2; the attributes FK are said to reference or refer to the relation R2.
2. A value of FK in a tuple t1 of the current state r1(R1) either occurs as a value of PK for some tuple t2 in the current state r2(R2) or is null. In the former case, we have t1[FK] = t2[PK], and we say that the tuple t1 references or refers to the tuple t2.

R1 is called the referencing relation and R2 is called the referenced relation. The conditions for a foreign key, given above, specify a referential integrity constraint between the two relation schemas R1 and R2.

Semantic Integrity Constraints: application-specific restrictions that are unlikely to be expressible in DDL. Examples:

o The salary of a supervisee cannot be greater than that of her/his supervisor.
o The salary of an employee cannot be lowered.
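Unlike semantic constraints, the foreign key conditions stated earlier in this section are easy to check mechanically. The following sketch assumes a single-attribute foreign key, uses None for null, and works on invented EMPLOYEE and DEPARTMENT states; the function name fk_violations is hypothetical.

```python
# Hypothetical sample states; None stands in for null.
department = [{"DNUMBER": 4}, {"DNUMBER": 5}]
employee = [
    {"SSN": "111", "DNO": 5},
    {"SSN": "222", "DNO": None},   # a null foreign key is allowed
    {"SSN": "333", "DNO": 9},      # no DEPARTMENT tuple has DNUMBER = 9
]

def fk_violations(referencing, fk, referenced, pk):
    """Return the referencing tuples whose FK value is neither null nor a PK value."""
    pk_values = {t[pk] for t in referenced}
    return [t for t in referencing if t[fk] is not None and t[fk] not in pk_values]

bad = fk_violations(employee, "DNO", department, "DNUMBER")
print(bad)   # [{'SSN': '333', 'DNO': 9}]
```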

Relational Databases and Relational Database Schemas


A relational database schema is a set of schemas for its relations together with a set of integrity constraints. A relational database state/instance/snapshot is a set of states of its relations such that no integrity constraint is violated.

Update Operations and Dealing with Constraint Violations


For each of the update operations (Insert, Delete, and Update), we consider what kinds of constraint violations may result from applying it and how we might choose to react. Insert:

o Domain constraint violation: some attribute value is not from the correct domain.
o Entity integrity violation: the primary key of the new tuple is null.
o Key constraint violation: the key of the new tuple is the same as that of an existing tuple.
o Referential integrity violation: a foreign key of the new tuple refers to a non-existent tuple.

Ways of dealing with it: reject the attempt to insert, or give the user the opportunity to try again with different attribute values.


Delete:

Referential integrity violation: a tuple referring to the deleted one exists.

Three options for dealing with it:


o Reject the deletion.
o Attempt to cascade (or propagate) by deleting any referencing tuples (plus those that reference them, and so on).
o Modify the foreign key attribute values in referencing tuples to null or to some valid value referencing a different tuple.
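The three options can be sketched as a policy parameter on a delete routine. This is an illustration under stated assumptions, not how any particular DBMS implements it: the policy names are invented, the data is a made-up EMPLOYEE/DEPARTMENT state, and the cascade goes only one level deep because nothing references EMPLOYEE here.

```python
POLICIES = {"reject", "cascade", "set_null"}   # hypothetical policy names

department = [{"DNUMBER": 4}, {"DNUMBER": 5}]
employee = [{"SSN": "111", "DNO": 5}, {"SSN": "222", "DNO": 4}]

def delete_department(dnumber, department, employee, on_delete="reject"):
    """Delete the DEPARTMENT tuple with this DNUMBER; return the new states."""
    if on_delete not in POLICIES:
        raise ValueError(f"unknown policy: {on_delete}")
    referenced = any(e["DNO"] == dnumber for e in employee)
    if referenced and on_delete == "reject":
        raise ValueError("deletion rejected: referencing EMPLOYEE tuples exist")
    if on_delete == "cascade":
        # Remove every tuple that refers to the deleted department.
        employee = [e for e in employee if e["DNO"] != dnumber]
    elif on_delete == "set_null":
        # Keep the referencing tuples but null out their foreign key.
        employee = [{**e, "DNO": None} if e["DNO"] == dnumber else e
                    for e in employee]
    department = [d for d in department if d["DNUMBER"] != dnumber]
    return department, employee

print(delete_department(5, department, employee, "cascade"))
print(delete_department(5, department, employee, "set_null"))
```

The function returns new lists rather than mutating its arguments, so the three policies can be compared on the same starting state.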

Update:

o Key constraint violation: the primary key is changed so as to become the same as another tuple's.
o Referential integrity violation:
  o a foreign key is changed and the new value refers to a nonexistent tuple
  o a primary key is changed and other tuples that had referred to this one now violate the constraint


Relational Algebra
A brief introduction

Relational algebra and relational calculus are formal languages associated with the relational model. Informally, relational algebra is a (high-level) procedural language and relational calculus a non-procedural language. Formally, however, the two are equivalent in expressive power. A language that can produce every relation derivable using relational calculus is called relationally complete.

o Relational algebra is the basic set of operations for the relational model.
o These operations enable a user to specify basic retrieval requests (or queries).
o Relational algebra operations work on one or more relations to define another relation without changing the original relations.
o Both operands and results are relations, so the output of one operation can become the input to another. This allows expressions to be nested, just as in arithmetic. The property is called closure: all objects in relational algebra are relations.
o A sequence of relational algebra operations forms a relational algebra expression, whose result is also a relation that represents the result of a database query (or retrieval request).


Relational Algebra consists of several groups of operations:
o Unary Relational Operations
  o SELECT (symbol: σ (sigma))
  o PROJECT (symbol: π (pi))
  o RENAME (symbol: ρ (rho))
o Relational Algebra Operations from Set Theory
  o UNION ( ∪ ), INTERSECTION ( ∩ ), DIFFERENCE (or MINUS, − )
  o CARTESIAN PRODUCT ( × )
o Binary Relational Operations
  o JOIN (several variations of JOIN exist)
  o DIVISION
o Additional Relational Operations
  o OUTER JOINS, OUTER UNION
  o AGGREGATE FUNCTIONS

Relational Algebra Operations:


Unary Relational Operations: SELECT

The SELECT operation (denoted by σ (sigma)) is used to select a subset of the tuples from a relation based on a selection condition.
o The selection condition acts as a filter.
o It keeps only those tuples that satisfy the qualifying condition.
o Tuples satisfying the condition are selected, whereas the other tuples are discarded (filtered out).

Examples:
o Select the EMPLOYEE tuples whose department number is 4:
  σ_{DNO = 4}(EMPLOYEE)
o Select the EMPLOYEE tuples whose salary is greater than $30,000:
  σ_{SALARY > 30000}(EMPLOYEE)

In general, the select operation is denoted by σ_{<selection condition>}(R), where
o the symbol σ (sigma) is used to denote the select operator
o the selection condition is a Boolean (conditional) expression specified on the attributes of relation R
o tuples that make the condition true are selected (appear in the result of the operation)
o tuples that make the condition false are filtered out (discarded from the result of the operation)


SELECT Operation Properties
o The SELECT operation σ_{<selection condition>}(R) produces a relation S that has the same schema (same attributes) as R.
o SELECT is commutative:
  σ_{<condition1>}(σ_{<condition2>}(R)) = σ_{<condition2>}(σ_{<condition1>}(R))
o Because of the commutative property, a cascade (sequence) of SELECT operations may be applied in any order:
  σ_{<cond1>}(σ_{<cond2>}(σ_{<cond3>}(R))) = σ_{<cond2>}(σ_{<cond3>}(σ_{<cond1>}(R)))
o A cascade of SELECT operations may be replaced by a single selection with a conjunction of all the conditions:
  σ_{<cond1>}(σ_{<cond2>}(σ_{<cond3>}(R))) = σ_{<cond1> AND <cond2> AND <cond3>}(R)
o The number of tuples in the result of a SELECT is less than (or equal to) the number of tuples in the input relation R.
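These properties can be observed on a small example. The sketch below assumes a relation is stored as a list of dictionaries; the EMPLOYEE rows and the helper names are invented for illustration.

```python
def select(relation, condition):
    """sigma: keep the tuples for which condition(t) is true; the schema is unchanged."""
    return [t for t in relation if condition(t)]

# Hypothetical EMPLOYEE state.
employee = [
    {"SSN": "111", "DNO": 4, "SALARY": 25000},
    {"SSN": "222", "DNO": 4, "SALARY": 43000},
    {"SSN": "333", "DNO": 5, "SALARY": 38000},
]

def in_dept_4(t):
    return t["DNO"] == 4

def well_paid(t):
    return t["SALARY"] > 30000

a = select(select(employee, well_paid), in_dept_4)                # one order
b = select(select(employee, in_dept_4), well_paid)                # the other order
c = select(employee, lambda t: in_dept_4(t) and well_paid(t))     # single conjunction

print(a == b == c)   # True
print(a)             # [{'SSN': '222', 'DNO': 4, 'SALARY': 43000}]
```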

Unary Relational Operations: PROJECT
o The PROJECT operation is denoted by π (pi).
o This operation keeps certain columns (attributes) from a relation and discards the other columns.
o PROJECT creates a vertical partitioning:
  o the list of specified columns (attributes) is kept in each tuple
  o the other attributes in each tuple are discarded
o Example: To list each employee's first and last name and salary, the following is used:
  π_{LNAME, FNAME, SALARY}(EMPLOYEE)
o The general form of the project operation is:
  π_{<attribute list>}(R)
  o π (pi) is the symbol used to represent the project operation
  o <attribute list> is the desired list of attributes from relation R
o The project operation removes any duplicate tuples.

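A minimal sketch of PROJECT, assuming relations are lists of dictionaries. The EMPLOYEE rows are invented so that two employees share a name and salary, which makes the duplicate elimination visible.

```python
def project(relation, attrs):
    """pi: keep only attrs in each tuple, then drop duplicate tuples."""
    result = []
    for t in relation:
        reduced = {a: t[a] for a in attrs}
        if reduced not in result:      # the result must be a set of tuples
            result.append(reduced)
    return result

# Hypothetical EMPLOYEE state.
employee = [
    {"FNAME": "John", "LNAME": "Smith", "SSN": "111", "SALARY": 30000},
    {"FNAME": "John", "LNAME": "Smith", "SSN": "222", "SALARY": 30000},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SSN": "333", "SALARY": 25000},
]

names = project(employee, ["LNAME", "FNAME", "SALARY"])
with_key = project(employee, ["SSN", "LNAME"])

print(len(names))      # 2: the two identical (Smith, John, 30000) tuples merge
print(len(with_key))   # 3: the list includes the key SSN, so nothing merges
```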

PROJECT Operation Properties
o The number of tuples in the result of the projection π_{<list>}(R) is always less than or equal to the number of tuples in R.
o If the list of attributes includes a key of R, then the number of tuples in the result of PROJECT is equal to the number of tuples in R.
o PROJECT is not commutative. However,
  π_{<list1>}(π_{<list2>}(R)) = π_{<list1>}(R)
  as long as <list2> contains the attributes in <list1>.

Unary Relational Operations: RENAME

o The RENAME operator is denoted by ρ (rho).
o In some cases, we may want to rename the attributes of a relation, or the relation name, or both.
  o Useful when a query requires multiple operations
  o Necessary in some cases (see the JOIN operation later)
o The general RENAME operation can be expressed in any of the following forms:
  o ρ_{S(B1, B2, ..., Bn)}(R) changes both the relation name to S and the column (attribute) names to B1, B2, ..., Bn
  o ρ_{S}(R) changes the relation name only, to S
  o ρ_{(B1, B2, ..., Bn)}(R) changes the column (attribute) names only, to B1, B2, ..., Bn
o For convenience, we also use a shorthand for renaming attributes in an intermediate relation:
  o If we write
    RESULT ← π_{FNAME, LNAME, SALARY}(DEP5_EMPS)
    the attributes of RESULT keep the names they have in DEP5_EMPS (the same names as in EMPLOYEE).
  o If we write
    RESULT(F, M, L, S, B, A, SX, SAL, SU, DNO) ← ρ_{RESULT(F, M, L, S, B, A, SX, SAL, SU, DNO)}(DEP5_EMPS)
    the 10 attributes of DEP5_EMPS are renamed to F, M, L, S, B, A, SX, SAL, SU, DNO, respectively.


Relational Algebra Operations from Set Theory


UNION Operation
o Binary operation, denoted by ∪.
o The result of R ∪ S is a relation that includes all tuples that are either in R or in S or in both R and S.
o Duplicate tuples are eliminated.
o The two operand relations R and S must be type compatible (or UNION compatible):
  o R and S must have the same number of attributes
  o each pair of corresponding attributes must be type compatible (have the same or compatible domains)

Example: To retrieve the social security numbers of all employees who either work in department 5 (RESULT1 below) or directly supervise an employee who works in department 5 (RESULT2 below), we can use the UNION operation as follows:

  DEP5_EMPS ← σ_{DNO=5}(EMPLOYEE)
  RESULT1 ← π_{SSN}(DEP5_EMPS)
  RESULT2(SSN) ← π_{SUPERSSN}(DEP5_EMPS)
  RESULT ← RESULT1 ∪ RESULT2

The union operation produces the tuples that are in either RESULT1 or RESULT2 or both.

o Type compatibility of operands is required for the binary set operation UNION ∪ (and also for INTERSECTION ∩ and SET DIFFERENCE −).
o R1(A1, A2, ..., An) and R2(B1, B2, ..., Bn) are type compatible if:
  o they have the same number of attributes, and
  o the domains of corresponding attributes are type compatible (i.e., dom(Ai) = dom(Bi) for i = 1, 2, ..., n).
o The resulting relation for R1 ∪ R2 (and also for R1 ∩ R2 and R1 − R2) has the same attribute names as the first operand relation R1 (by convention).
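The department 5 example can be traced step by step in Python. This sketch assumes relations are sets of positional value tuples; the EMPLOYEE rows are invented, and the union helper checks only that the degrees match, not that the domains are compatible.

```python
# Hypothetical EMPLOYEE state; each tuple is (SSN, SUPERSSN, DNO).
employee = {
    ("111", "333", 5),
    ("222", "333", 5),
    ("333", "888", 4),
    ("444", "333", 4),
}

dep5_emps = {t for t in employee if t[2] == 5}            # sigma DNO=5
result1 = {(ssn,) for ssn, _, _ in dep5_emps}             # pi SSN
result2 = {(superssn,) for _, superssn, _ in dep5_emps}   # pi SUPERSSN, renamed to SSN

def union(r, s):
    """R union S for relations stored as sets of equal-length value tuples."""
    degrees = {len(t) for t in r | s}
    if len(degrees) > 1:
        raise ValueError("operands are not union compatible: degrees differ")
    return r | s

result = union(result1, result2)
print(sorted(result))   # [('111',), ('222',), ('333',)]
```

Employee 333 appears once in the result even though both department 5 employees report to that supervisor, because the union (and the projection before it) eliminates duplicates.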


INTERSECTION Operation
o INTERSECTION is denoted by ∩.
o The result of the operation R ∩ S is a relation that includes all tuples that are in both R and S.
o The attribute names in the result will be the same as the attribute names in R.
o The two operand relations R and S must be type compatible.

DIFFERENCE Operation
o SET DIFFERENCE (also called MINUS or EXCEPT) is denoted by −.
o The result of R − S is a relation that includes all tuples that are in R but not in S.
o The attribute names in the result will be the same as the attribute names in R.
o The two operand relations R and S must be type compatible.

Some properties of UNION, INTERSECT, and DIFFERENCE
o Both union and intersection are commutative operations; that is,
  R ∪ S = S ∪ R, and R ∩ S = S ∩ R
o Both union and intersection can be treated as n-ary operations applicable to any number of relations, as both are associative operations; that is,
  R ∪ (S ∪ T) = (R ∪ S) ∪ T
  (R ∩ S) ∩ T = R ∩ (S ∩ T)
o The minus operation is not commutative; that is, in general,
  R − S ≠ S − R

CARTESIAN PRODUCT Operation
o This operation is used to combine tuples from two relations in a combinatorial fashion.
o Denoted by R(A1, A2, ..., An) × S(B1, B2, ..., Bm).
o The result is a relation Q with degree n + m attributes: Q(A1, A2, ..., An, B1, B2, ..., Bm), in that order.
o The resulting relation state has one tuple for each combination of tuples, one from R and one from S.
o Hence, if R has nR tuples (denoted as |R| = nR) and S has nS tuples, then R × S will have nR * nS tuples.
o The two operands do NOT have to be "type compatible".
o Generally, CROSS PRODUCT is not a meaningful operation by itself.
  o It can become meaningful when followed by other operations.
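Python's built-in set type follows the same laws, which gives a quick way to see them on concrete data. The three small relations below are invented, and cartesian_product is a hypothetical helper that assumes positional value tuples.

```python
from itertools import product

r = {(1, "a"), (2, "b")}
s = {(2, "b"), (3, "c")}
t = {(3, "c"), (4, "d")}

print(r | s == s | r, r & s == s & r)   # True True: both are commutative
print(r | (s | t) == (r | s) | t)       # True: union is associative
print(r - s == s - r)                   # False: MINUS is not commutative

def cartesian_product(r, s):
    """R x S: concatenate every tuple of R with every tuple of S."""
    return {tr + ts for tr, ts in product(r, s)}

q = cartesian_product(r, s)
print(len(q) == len(r) * len(s))        # True: nR * nS tuples
```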


Example (not meaningful):
  FEMALE_EMPS ← σ_{SEX='F'}(EMPLOYEE)
  EMPNAMES ← π_{FNAME, LNAME, SSN}(FEMALE_EMPS)
  EMP_DEPENDENTS ← EMPNAMES × DEPENDENT
o EMP_DEPENDENTS will contain every combination of EMPNAMES and DEPENDENT tuples, whether or not they are actually related.
o To keep only combinations where the DEPENDENT is related to the EMPLOYEE, we add a SELECT operation as follows.

Example (meaningful):
  FEMALE_EMPS ← σ_{SEX='F'}(EMPLOYEE)
  EMPNAMES ← π_{FNAME, LNAME, SSN}(FEMALE_EMPS)
  EMP_DEPENDENTS ← EMPNAMES × DEPENDENT
  ACTUAL_DEPS ← σ_{SSN=ESSN}(EMP_DEPENDENTS)
  RESULT ← π_{FNAME, LNAME, DEPENDENT_NAME}(ACTUAL_DEPS)
RESULT will now contain the names of female employees and their dependents.

JOIN Operation
o The sequence of CARTESIAN PRODUCT followed by SELECT is used quite commonly to identify and select related tuples from two relations.
o A special operation, called JOIN and denoted by ⋈, combines this sequence into a single operation.
o This operation is very important for any relational database with more than a single relation, because it allows us to combine related tuples from various relations.
o The general form of a join operation on two relations R(A1, A2, ..., An) and S(B1, B2, ..., Bm) is:
  R ⋈_{<join condition>} S
  where R and S can be any relations that result from general relational algebra expressions.

Example: Suppose that we want to retrieve the name of the manager of each department.
o To get the manager's name, we need to combine each DEPARTMENT tuple with the EMPLOYEE tuple whose SSN value matches the MGRSSN value in the department tuple.
o We do this by using the join operation:
  DEPT_MGR ← DEPARTMENT ⋈_{MGRSSN=SSN} EMPLOYEE


o MGRSSN=SSN is the join condition.
o It combines each department record with the employee who manages the department.
o The join condition can also be specified as DEPARTMENT.MGRSSN = EMPLOYEE.SSN.

Consider the following JOIN operation:
  R(A1, A2, ..., An) ⋈_{R.Ai=S.Bj} S(B1, B2, ..., Bm)
o The result is a relation Q with degree n + m attributes: Q(A1, A2, ..., An, B1, B2, ..., Bm), in that order.
o The resulting relation state has one tuple for each combination of tuples r from R and s from S, but only if they satisfy the join condition r[Ai] = s[Bj].
o Hence, if R has nR tuples and S has nS tuples, then the join result will generally have fewer than nR * nS tuples.
o Only related tuples (based on the join condition) will appear in the result.

Some properties of JOIN
o The general case of the JOIN operation is called a theta-join: R ⋈_{theta} S.
o The join condition is called theta.
o Theta can be any general Boolean expression on the attributes of R and S; for example:
  R.Ai < S.Bj AND (R.Ak = S.Bl OR R.Ap < S.Bq)
o Most join conditions involve one or more equality conditions ANDed together; for example:
  R.Ai = S.Bj AND R.Ak = S.Bl AND R.Ap = S.Bq

EQUIJOIN Operation
o The most common use of join involves join conditions with equality comparisons only.
o Such a join, where the only comparison operator used is =, is called an EQUIJOIN.
o In the result of an EQUIJOIN we always have one or more pairs of attributes (whose names need not be identical) that have identical values in every tuple.
o The JOIN seen in the previous example was an EQUIJOIN.
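A theta-join can be sketched as a filtered Cartesian product. The code below is an illustration only: relations are lists of dictionaries, the two operands are assumed to have disjoint attribute names (otherwise merging the dictionaries would overwrite values), and the DEPARTMENT and EMPLOYEE rows are invented.

```python
def theta_join(r, s, condition):
    """R join S: the pairs from R x S that satisfy condition, merged into one tuple."""
    return [{**tr, **ts} for tr in r for ts in s if condition(tr, ts)]

# Hypothetical states.
department = [
    {"DNAME": "Research", "DNUMBER": 5, "MGRSSN": "333"},
    {"DNAME": "Administration", "DNUMBER": 4, "MGRSSN": "987"},
]
employee = [
    {"SSN": "333", "LNAME": "Wong"},
    {"SSN": "987", "LNAME": "Wallace"},
    {"SSN": "111", "LNAME": "Smith"},
]

# DEPT_MGR: join DEPARTMENT with EMPLOYEE on MGRSSN = SSN (an EQUIJOIN).
dept_mgr = theta_join(department, employee, lambda d, e: d["MGRSSN"] == e["SSN"])

for t in dept_mgr:
    print(t["DNAME"], t["LNAME"])
# Research Wong
# Administration Wallace
```

The result has 2 tuples rather than the 2 * 3 = 6 of the full product, and each result tuple carries both MGRSSN and SSN with identical values, as described for EQUIJOIN.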


NATURAL JOIN Operation
o Another variation of JOIN, called NATURAL JOIN and denoted by *, was created to get rid of the second (superfluous) attribute in an EQUIJOIN condition, because one of each pair of attributes with identical values is superfluous.
o The standard definition of natural join requires that the two join attributes, or each pair of corresponding join attributes, have the same name in both relations.
o If this is not the case, a renaming operation is applied first.

Example: To apply a natural join on the DNUMBER attributes of DEPARTMENT and DEPT_LOCATIONS, it is sufficient to write:
  DEPT_LOCS ← DEPARTMENT * DEPT_LOCATIONS
o The only attribute with the same name is DNUMBER.
o An implicit join condition is created based on this attribute:
  DEPARTMENT.DNUMBER = DEPT_LOCATIONS.DNUMBER

Another example:
  Q ← R(A,B,C,D) * S(C,D,E)
o The implicit join condition includes each pair of attributes with the same name, ANDed together:
  R.C = S.C AND R.D = S.D
o The result keeps only one attribute of each such pair:
  Q(A,B,C,D,E)

DIVISION Operation
o The division operation is applied to two relations: R(Z) ÷ S(X), where X is a subset of Z.
o Let Y = Z − X; that is, let Y be the set of attributes of R that are not attributes of S.
o The result of DIVISION is a relation T(Y) that includes a tuple t if tuples tR appear in R with tR[Y] = t, and with tR[X] = tS for every tuple tS in S.
o For a tuple t to appear in the result T of the DIVISION, the values in t must appear in R in combination with every tuple in S.
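Both operations can be sketched in a few lines. The code assumes non-empty relations stored as lists of dictionaries; the function names and all sample rows (including the WORKS_ON-style division example) are invented for illustration.

```python
def natural_join(r, s):
    """R * S: equijoin on all same-named attributes, keeping one copy of each."""
    common = set(r[0]) & set(s[0]) if r and s else set()
    return [{**tr, **ts} for tr in r for ts in s
            if all(tr[a] == ts[a] for a in common)]

def divide(r, s):
    """R(Z) / S(X): the Y = Z - X parts of R that occur with every tuple of S."""
    x = sorted(s[0])
    y = sorted(set(r[0]) - set(x))
    r_pairs = {(tuple(t[a] for a in y), tuple(t[a] for a in x)) for t in r}
    s_rows = {tuple(t[a] for a in x) for t in s}
    candidates = {yv for yv, _ in r_pairs}
    return [dict(zip(y, yv)) for yv in sorted(candidates)
            if all((yv, xv) in r_pairs for xv in s_rows)]

# Q <- R(A,B,C,D) * S(C,D,E), on hypothetical states.
r = [{"A": 1, "B": 2, "C": 3, "D": 4}, {"A": 5, "B": 6, "C": 7, "D": 8}]
s = [{"C": 3, "D": 4, "E": 9}, {"C": 3, "D": 0, "E": 1}]
q = natural_join(r, s)
print(q)   # [{'A': 1, 'B': 2, 'C': 3, 'D': 4, 'E': 9}]

# Division: which ESSN values appear with every PNO listed in projects?
works_on = [{"ESSN": "111", "PNO": 1}, {"ESSN": "111", "PNO": 2},
            {"ESSN": "222", "PNO": 1}]
projects = [{"PNO": 1}, {"PNO": 2}]
print(divide(works_on, projects))   # [{'ESSN': '111'}]
```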


Additional Relational Operations: Aggregate Functions and Grouping
o A type of request that cannot be expressed in the basic relational algebra is to specify mathematical aggregate functions on collections of values from the database.
o Examples of such functions include retrieving the average or total salary of all employees or the total number of employee tuples.
o These functions are used in simple statistical queries that summarize information from the database tuples.
o Common functions applied to collections of numeric values include SUM, AVERAGE, MAXIMUM, and MINIMUM.
o The COUNT function is used for counting tuples or values.
o Use of the aggregate function operation ℱ (script F):
  o ℱ_{MAX Salary}(EMPLOYEE) retrieves the maximum salary value from the EMPLOYEE relation
  o ℱ_{MIN Salary}(EMPLOYEE) retrieves the minimum salary value from the EMPLOYEE relation
  o ℱ_{SUM Salary}(EMPLOYEE) retrieves the sum of the salaries from the EMPLOYEE relation


  o ℱ_{COUNT SSN, AVERAGE Salary}(EMPLOYEE) computes the count (number) of employees and their average salary
  o Note: COUNT just counts the number of rows, without removing duplicates.
o The previous examples all summarized one or more attributes for a set of tuples.
o Grouping can be combined with aggregate functions.

Example: For each department, retrieve the DNO, COUNT SSN, and AVERAGE SALARY.
o A variation of the aggregate operation allows this:
  o the grouping attribute is placed to the left of the symbol ℱ
  o the aggregate functions are placed to the right of the symbol ℱ

  DNO ℱ_{COUNT SSN, AVERAGE Salary}(EMPLOYEE)
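The aggregate functions and the grouped variant can be sketched as follows. Relations are assumed to be lists of dictionaries, the EMPLOYEE rows are invented, and count_and_average_by is a hypothetical helper that hard-codes COUNT and AVERAGE rather than accepting an arbitrary function list.

```python
from collections import defaultdict

# Hypothetical EMPLOYEE state.
employee = [
    {"SSN": "111", "DNO": 5, "SALARY": 30000},
    {"SSN": "222", "DNO": 5, "SALARY": 40000},
    {"SSN": "333", "DNO": 4, "SALARY": 25000},
]

# Aggregates over the whole relation.
salaries = [t["SALARY"] for t in employee]
totals = {"MAX": max(salaries), "MIN": min(salaries),
          "SUM": sum(salaries), "COUNT": len(employee)}

def count_and_average_by(relation, group_attr, value_attr):
    """Group tuples by group_attr, then compute COUNT and AVERAGE of value_attr."""
    groups = defaultdict(list)
    for t in relation:
        groups[t[group_attr]].append(t[value_attr])
    return [{group_attr: g, "COUNT": len(vals), "AVERAGE": sum(vals) / len(vals)}
            for g, vals in sorted(groups.items())]

per_dept = count_and_average_by(employee, "DNO", "SALARY")
print(totals)
print(per_dept)
```

Note that the result of the grouped operation is again a relation: one tuple per distinct DNO value, with the grouping attribute followed by the aggregate values.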

The grouped operation shown above groups employees by DNO (department number) and computes the count of employees and the average salary per department.

The OUTER JOIN Operation
o In NATURAL JOIN and EQUIJOIN, tuples without a matching (or related) tuple are eliminated from the join result.
  o Tuples with null in the join attributes are also eliminated.
  o This amounts to loss of information.
o A set of operations, called OUTER joins, can be used when we want to keep all the tuples in R, or all those in S, or all those in both relations in the result of the join, regardless of whether or not they have matching tuples in the other relation.
o The left outer join operation, R ⟕ S, keeps every tuple in the first or left relation R; if no matching tuple is found in S, then the attributes of S in the join result are filled or padded with null values.
o A similar operation, right outer join, keeps every tuple in the second or right relation S in the result of R ⟖ S.
o A third operation, full outer join, denoted by ⟗, keeps all tuples in both the left and the right relations when no matching tuples are found, padding them with null values as needed.
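A left outer join can be sketched by falling back to a padded tuple whenever an R tuple has no match. The sketch assumes list-of-dictionary relations with disjoint attribute names and uses None for null; the rows and the s_attrs parameter (the attribute names of S, needed for padding) are choices made for this illustration.

```python
def left_outer_join(r, s, condition, s_attrs):
    """Keep every tuple of R; pad the S attributes with None when nothing matches."""
    result = []
    for tr in r:
        matches = [{**tr, **ts} for ts in s if condition(tr, ts)]
        # dict.fromkeys(s_attrs) maps every S attribute to None (null).
        result.extend(matches or [{**tr, **dict.fromkeys(s_attrs)}])
    return result

# Hypothetical states: only Wong manages a department.
employee = [{"SSN": "111", "LNAME": "Smith"}, {"SSN": "333", "LNAME": "Wong"}]
department = [{"DNAME": "Research", "MGRSSN": "333"}]

result = left_outer_join(employee, department,
                         lambda e, d: e["SSN"] == d["MGRSSN"],
                         ["DNAME", "MGRSSN"])
for t in result:
    print(t)
```

Smith survives in the result with DNAME and MGRSSN set to None, whereas an ordinary join on the same condition would drop that tuple. A right outer join is the same operation with the operand roles swapped.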


OUTER UNION Operation
o The outer union operation was developed to take the union of tuples from two relations if the relations are not type compatible.
o This operation will take the union of tuples in two relations R(X, Y) and S(X, Z) that are partially compatible, meaning that only some of their attributes, say X, are type compatible.
o The attributes that are type compatible are represented only once in the result, and those attributes that are not type compatible from either relation are also kept in the result relation T(X, Y, Z).
o Example: An outer union can be applied to two relations whose schemas are STUDENT(Name, SSN, Department, Advisor) and INSTRUCTOR(Name, SSN, Department, Rank).
  o Tuples from the two relations are matched based on having the same combination of values of the shared attributes Name, SSN, Department.
  o If a student is also an instructor, both Advisor and Rank will have a value; otherwise, one of these two attributes will be null.
  o The result relation STUDENT_OR_INSTRUCTOR will have the following attributes:
    STUDENT_OR_INSTRUCTOR(Name, SSN, Department, Advisor, Rank)


Relational calculus
Relational algebra and calculus are equivalent in their expressive power. Relational algebra provides a collection of explicit operations - join, union, projection, etc. The operations are used to tell the system how to build some desired relation in terms of other relations. The calculus merely provides a notation for formulating the definition of that desired relation in terms of those given relations.

Relational Algebra vs. Relational Calculus


Relational algebra is procedural; it is more like a programming language. Relational calculus is nonprocedural; it is closer to a natural language.

For example, suppose you want to query: Get supplier numbers for suppliers who supply part P2.

An algebraic version of this query might follow these steps:
1. Form the natural join of relations S and SP on S#;
2. Next, restrict the result of that join to tuples for part P2;
3. Finally, project the result of that restriction on S#.

A calculus formulation might look like: Get S# for suppliers such that there exists a shipment SP with the same S# value and with P# value P2.

The calculus formulation is descriptive while the algebraic one is prescriptive.
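The contrast can be mimicked in Python, with the caveat that Python still evaluates both versions in some order; only the style of the second version resembles the calculus. The S and SP rows are invented for illustration.

```python
# Hypothetical states of S (suppliers) and SP (shipments).
S = [{"S#": "S1", "SNAME": "Smith"}, {"S#": "S2", "SNAME": "Jones"},
     {"S#": "S3", "SNAME": "Blake"}]
SP = [{"S#": "S1", "P#": "P1"}, {"S#": "S1", "P#": "P2"},
      {"S#": "S2", "P#": "P2"}, {"S#": "S3", "P#": "P1"}]

# Algebra style: a prescribed sequence of operations.
joined = [{**s, **sp} for s in S for sp in SP if s["S#"] == sp["S#"]]   # 1. join on S#
restricted = [t for t in joined if t["P#"] == "P2"]                     # 2. restrict to P2
algebra_answer = {t["S#"] for t in restricted}                          # 3. project on S#

# Calculus style: state the condition a supplier must satisfy
# ("there exists a shipment with the same S# and with P# = P2").
calculus_answer = {s["S#"] for s in S
                   if any(sp["S#"] == s["S#"] and sp["P#"] == "P2" for sp in SP)}

print(sorted(algebra_answer), sorted(calculus_answer))
# ['S1', 'S2'] ['S1', 'S2']
```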


Why is it called relational calculus?


It is founded on a branch of mathematical logic called the predicate calculus.

Codd proposed the concept of a relational calculus (applied predicate calculus tailored to relational databases).
o A relational calculus expression creates a new relation, which is specified in terms of variables that range over rows of the stored database relations (in tuple calculus) or over columns of the stored relations (in domain calculus).
o In a calculus expression, there is no order of operations to specify how to retrieve the query result; a calculus expression specifies only what information the result should contain.
o This is the main distinguishing feature between relational algebra and relational calculus.
o Relational calculus is considered to be a nonprocedural or declarative language.
o This differs from relational algebra, where we must write a sequence of operations to specify a retrieval request; hence relational algebra can be considered a procedural way of stating a query.

The Tuple Relational Calculus


1. The tuple relational calculus is a nonprocedural language. (The relational algebra was procedural.) We must provide a formal description of the information desired.
2. A query in the tuple relational calculus is expressed as
   { t | P(t) }
   i.e., the set of tuples t for which predicate P is true.
3. We also use the notation
   o t[a] to indicate the value of tuple t on attribute a
   o t ∈ r to show that tuple t is in relation r

Example: To find the first and last names of all employees whose salary is above $50,000, we can write the following tuple calculus expression:
  {t.FNAME, t.LNAME | EMPLOYEE(t) AND t.SALARY>50000}
o The condition EMPLOYEE(t) specifies that the range relation of tuple variable t is EMPLOYEE.
o The first and last name (PROJECTION π_{FNAME, LNAME}) of each EMPLOYEE tuple t that satisfies the condition t.SALARY>50000 (SELECTION σ_{SALARY>50000}) will be retrieved.

The Existential and Universal Quantifiers


o Two special symbols called quantifiers can appear in formulas; these are the universal quantifier (∀) and the existential quantifier (∃).
o Informally, a tuple variable t is bound if it is quantified, meaning that it appears in an (∀t) or (∃t) clause; otherwise, it is free.
o If F is a formula, then so are (∃t)(F) and (∀t)(F), where t is a tuple variable.
o The formula (∃t)(F) is true if the formula F evaluates to true for some (at least one) tuple assigned to free occurrences of t in F; otherwise (∃t)(F) is false.
o The formula (∀t)(F) is true if the formula F evaluates to true for every tuple (in the universe) assigned to free occurrences of t in F; otherwise (∀t)(F) is false.
o ∀ is called the universal or "for all" quantifier because every tuple in the universe of tuples must make F true to make the quantified formula true.
o ∃ is called the existential or "there exists" quantifier because any tuple that exists in the universe of tuples may make F true to make the quantified formula true.

Examples
o Find the names of employees who work on all the projects controlled by department number 5. The query can be:
  {e.LNAME, e.FNAME | EMPLOYEE(e) and ((∀x)(not(PROJECT(x)) or not(x.DNUM=5) or ((∃w)(WORKS_ON(w) and w.ESSN=e.SSN and x.PNUMBER=w.PNO))))}
o Exclude from the universal quantification all tuples that we are not interested in by making the condition true for all such tuples.
o The first tuples to exclude (by making them evaluate automatically to true) are those that are not in the relation R of interest.
o In the query above, the expression not(PROJECT(x)) inside the universally quantified formula evaluates to true for all tuples x that are not in the PROJECT relation.
o Then we exclude the tuples we are not interested in from R itself. The expression not(x.DNUM=5) evaluates to true for all tuples x that are in the PROJECT relation but are not controlled by department 5.
o Finally, we specify a condition that must hold on all the remaining tuples in R:
  ((∃w)(WORKS_ON(w) and w.ESSN=e.SSN and x.PNUMBER=w.PNO))
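Python's all() and any() play the roles of the universal and existential quantifiers, so the query can be transcribed almost literally. In this sketch x ranges over the PROJECT tuples only, which takes the place of the not(PROJECT(x)) disjunct; the three relation states are invented for illustration.

```python
# Hypothetical states.
employee = [{"SSN": "111", "LNAME": "Smith"}, {"SSN": "222", "LNAME": "Wong"}]
project = [{"PNUMBER": 1, "DNUM": 5}, {"PNUMBER": 2, "DNUM": 5},
           {"PNUMBER": 3, "DNUM": 4}]
works_on = [{"ESSN": "111", "PNO": 1}, {"ESSN": "111", "PNO": 2},
            {"ESSN": "222", "PNO": 1}, {"ESSN": "222", "PNO": 3}]

def works_on_all_dept5_projects(e):
    """(forall x)(not(x.DNUM = 5) or (exists w)(...)), with x ranging over PROJECT."""
    return all(
        x["DNUM"] != 5
        or any(w["ESSN"] == e["SSN"] and w["PNO"] == x["PNUMBER"] for w in works_on)
        for x in project
    )

answer = [e["LNAME"] for e in employee if works_on_all_dept5_projects(e)]
print(answer)   # ['Smith']
```

Wong is excluded because project 2 is controlled by department 5 and no WORKS_ON tuple links Wong to it; project 3 makes the formula true for every employee because it is not a department 5 project.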

The Domain Relational Calculus


The domain-oriented calculus differs from the tuple-oriented relational calculus in that it has domain variables instead of tuple variables, that is, variables that range over domains instead of over relations.
o An expression of the domain calculus is of the form
  { x1, x2, ..., xn | COND(x1, x2, ..., xn, xn+1, xn+2, ..., xn+m) }
o where x1, x2, ..., xn, xn+1, xn+2, ..., xn+m are domain variables that range over domains (of attributes)
o and COND is a condition or formula of the domain relational calculus.

Examples
o Retrieve the birthdate and address of the employee whose name is John B. Smith.
o Query: {uv | (∃q)(∃r)(∃s)(∃t)(∃w)(∃x)(∃y)(∃z)(EMPLOYEE(qrstuvwxyz) and q='John' and r='B' and s='Smith')}
o The abbreviated notation EMPLOYEE(qrstuvwxyz) writes the variables without the separating commas; it stands for EMPLOYEE(q,r,s,t,u,v,w,x,y,z).
o Ten variables are needed for the EMPLOYEE relation, one to range over the domain of each attribute in order. Of the ten variables q, r, s, ..., z, only u and v are free.
o Specify the requested attributes, BDATE and ADDRESS, by the free domain variables u for BDATE and v for ADDRESS.
o Specify the condition for selecting a tuple following the bar ( | ), namely, that the sequence of values assigned to the variables qrstuvwxyz be a tuple of the EMPLOYEE relation and that the values for q (FNAME), r (MINIT), and s (LNAME) be John, B, and Smith, respectively.
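As a rough illustration, the domain calculus query reads naturally as a set comprehension in which each domain variable is bound to one position of an EMPLOYEE tuple. The sample tuples below are hypothetical, and the values for t, w, x, y, and z are placeholders because the text does not name those attributes.

```python
# Each EMPLOYEE tuple has ten positional values (q, r, s, t, u, v, w, x, y, z).
# Per the text: q=FNAME, r=MINIT, s=LNAME, u=BDATE, v=ADDRESS.
# All values here are hypothetical sample data.
EMPLOYEE = {
    ("John", "B", "Smith", "t1", "1965-01-09", "731 Fondren", "w1", "x1", "y1", "z1"),
    ("Alicia", "J", "Zelaya", "t2", "1968-07-19", "3321 Castle", "w2", "x2", "y2", "z2"),
}

# {uv | (exists q, r, s, t, w, x, y, z)(EMPLOYEE(qrstuvwxyz) and q='John' and r='B' and s='Smith')}
result = {
    (u, v)
    for (q, r, s, t, u, v, w, x, y, z) in EMPLOYEE
    if q == "John" and r == "B" and s == "Smith"
}
print(result)
```

Only u and v appear in the output tuple, which mirrors the fact that they are the only free variables; the other eight are consumed by the iteration, just as they are bound by existential quantifiers in the query.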

ER-to-Relational Mapping Algorithm


Step 1: Mapping of Regular Entity Types
Step 2: Mapping of Weak Entity Types
Step 3: Mapping of Binary 1:1 Relationship Types
Step 4: Mapping of Binary 1:N Relationship Types
Step 5: Mapping of Binary M:N Relationship Types
Step 6: Mapping of Multivalued Attributes
Step 7: Mapping of N-ary Relationship Types

Step 1: Mapping of Regular Entity Types. For each regular (strong) entity type E in the ER schema, create a relation R that includes all the simple attributes of E. Choose one of the key attributes of E as the primary key for R. If the chosen key of E is composite, the set of simple attributes that form it will together form the primary key of R.

Example: We create the relations EMPLOYEE, DEPARTMENT, and PROJECT in the relational schema corresponding to the regular entities in the ER diagram. SSN, DNUMBER, and PNUMBER are the primary keys for the relations EMPLOYEE, DEPARTMENT, and PROJECT as shown.

Step 2: Mapping of Weak Entity Types. For each weak entity type W in the ER schema with owner entity type E, create a relation R and include all simple attributes (or simple components of composite attributes) of W as attributes of R. In addition, include as foreign key attributes of R the primary key attribute(s) of the relation(s) that correspond to the owner entity type(s). The primary key of R is the combination of the primary key(s) of the owner(s) and the partial key of the weak entity type W, if any.

Example: Create the relation DEPENDENT in this step to correspond to the weak entity type DEPENDENT. Include the primary key SSN of the EMPLOYEE relation as a foreign key attribute of DEPENDENT (renamed to ESSN). The primary key of the DEPENDENT relation is the combination {ESSN, DEPENDENT_NAME} because DEPENDENT_NAME is the partial key of DEPENDENT.
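Steps 1 and 2 might look as follows when written as table definitions. This is a minimal sketch using Python's built-in sqlite3 module; only the key attributes named in the text are taken from the source, while the column types and the LNAME column are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
-- Step 1: one relation per regular entity type, with its chosen primary key.
CREATE TABLE EMPLOYEE   (SSN TEXT PRIMARY KEY, LNAME TEXT);
CREATE TABLE DEPARTMENT (DNUMBER INTEGER PRIMARY KEY);
CREATE TABLE PROJECT    (PNUMBER INTEGER PRIMARY KEY);

-- Step 2: the weak entity type takes the owner's key (renamed ESSN) as a
-- foreign key and combines it with the partial key DEPENDENT_NAME.
CREATE TABLE DEPENDENT (
    ESSN           TEXT NOT NULL REFERENCES EMPLOYEE(SSN),
    DEPENDENT_NAME TEXT NOT NULL,
    PRIMARY KEY (ESSN, DEPENDENT_NAME)
);
""")

conn.execute("INSERT INTO EMPLOYEE VALUES ('123456789', 'Smith')")
conn.execute("INSERT INTO DEPENDENT VALUES ('123456789', 'Alice')")

# A dependent whose owner does not exist violates the foreign key.
try:
    conn.execute("INSERT INTO DEPENDENT VALUES ('999999999', 'Bob')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Columns that take part in the primary key of DEPENDENT (pk position > 0).
dependent_pk = [row[1] for row in conn.execute("PRAGMA table_info(DEPENDENT)") if row[5] > 0]
print(orphan_rejected, dependent_pk)
```

The rejected insert shows why the owner's key is carried as a foreign key: a weak entity cannot exist without its owner.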

Step 3: Mapping of Binary 1:1 Relationship Types. For each binary 1:1 relationship type R in the ER schema, identify the relations S and T that correspond to the entity types participating in R. There are three possible approaches:

(1) Foreign key approach: Choose one of the relations, say S, and include as a foreign key in S the primary key of T. It is better to choose an entity type with total participation in R in the role of S. Example: The 1:1 relationship type MANAGES is mapped by choosing the participating entity type DEPARTMENT to serve in the role of S, because its participation in the MANAGES relationship type is total.

(2) Merged relation option: An alternative mapping of a 1:1 relationship type is possible by merging the two entity types and the relationship into a single relation. This may be appropriate when both participations are total.

(3) Cross-reference or relationship relation option: The third alternative is to set up a third relation R for the purpose of cross-referencing the primary keys of the two relations S and T representing the entity types.
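The foreign key approach for MANAGES could be sketched as below, again with sqlite3. The column name MGRSSN is an assumption (the text does not name the foreign key attribute), as are the column types and sample rows. NOT NULL reflects the total participation of DEPARTMENT, and UNIQUE keeps the relationship 1:1 rather than 1:N.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY);

-- DEPARTMENT plays the role of S and carries the key of T (EMPLOYEE).
-- MGRSSN is a hypothetical name for that foreign key attribute.
CREATE TABLE DEPARTMENT (
    DNUMBER INTEGER PRIMARY KEY,
    MGRSSN  TEXT NOT NULL UNIQUE REFERENCES EMPLOYEE(SSN)
);

INSERT INTO EMPLOYEE VALUES ('111'), ('222');
INSERT INTO DEPARTMENT VALUES (5, '111');
""")

# The same employee cannot manage a second department.
try:
    conn.execute("INSERT INTO DEPARTMENT VALUES (4, '111')")
    second_department_allowed = True
except sqlite3.IntegrityError:
    second_department_allowed = False

print(second_department_allowed)
```

Placing the foreign key on the side with total participation avoids null values: every department has a manager, whereas most employees manage nothing.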

Step 4: Mapping of Binary 1:N Relationship Types. For each regular binary 1:N relationship type R, identify the relation S that represents the participating entity type at the N-side of the relationship type. Include as foreign key in S the primary key of the relation T that represents the other entity type participating in R. Include any simple attributes of the 1:N relationship type as attributes of S.

Example: The 1:N relationship types WORKS_FOR, CONTROLS, and SUPERVISION in the figure are mapped this way. For WORKS_FOR we include the primary key DNUMBER of the DEPARTMENT relation as foreign key in the EMPLOYEE relation and call it DNO.

Step 5: Mapping of Binary M:N Relationship Types. For each regular binary M:N relationship type R, create a new relation S to represent R. Include as foreign key attributes in S the primary keys of the relations that represent the participating entity types; their combination will form the primary key of S. Also include any simple attributes of the M:N relationship type (or simple components of composite attributes) as attributes of S.

Example: The M:N relationship type WORKS_ON from the ER diagram is mapped by creating a relation WORKS_ON in the relational database schema. The primary keys of the PROJECT and EMPLOYEE relations are included as foreign keys in WORKS_ON and renamed PNO and ESSN, respectively. Attribute HOURS in WORKS_ON represents the HOURS attribute of the relationship type. The primary key of the WORKS_ON relation is the combination of the foreign key attributes {ESSN, PNO}.

Step 6: Mapping of Multivalued Attributes. For each multivalued attribute A, create a new relation R. This relation R will include an attribute corresponding to A, plus the primary key attribute K of the relation that represents the entity type or relationship type that has A as an attribute; K appears in R as a foreign key. The primary key of R is the combination of A and K. If the multivalued attribute is composite, we include its simple components.

Example: The relation DEPT_LOCATIONS is created. The attribute DLOCATION represents the multivalued attribute LOCATIONS of DEPARTMENT, while DNUMBER, as foreign key, represents the primary key of the DEPARTMENT relation. The primary key of DEPT_LOCATIONS is the combination {DNUMBER, DLOCATION}.
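Steps 4 to 6 can be sketched together as table definitions in sqlite3. Relation and attribute names come from the examples above; the column types and the helper primary_key are assumptions introduced only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE DEPARTMENT (DNUMBER INTEGER PRIMARY KEY);
CREATE TABLE PROJECT    (PNUMBER INTEGER PRIMARY KEY);

-- Step 4 (1:N WORKS_FOR): the N-side relation EMPLOYEE gets the foreign key DNO.
CREATE TABLE EMPLOYEE (
    SSN TEXT PRIMARY KEY,
    DNO INTEGER REFERENCES DEPARTMENT(DNUMBER)
);

-- Step 5 (M:N WORKS_ON): a new relation whose key combines both foreign keys.
CREATE TABLE WORKS_ON (
    ESSN  TEXT    NOT NULL REFERENCES EMPLOYEE(SSN),
    PNO   INTEGER NOT NULL REFERENCES PROJECT(PNUMBER),
    HOURS REAL,
    PRIMARY KEY (ESSN, PNO)
);

-- Step 6 (multivalued LOCATIONS): one row per (department, location) pair.
CREATE TABLE DEPT_LOCATIONS (
    DNUMBER   INTEGER NOT NULL REFERENCES DEPARTMENT(DNUMBER),
    DLOCATION TEXT    NOT NULL,
    PRIMARY KEY (DNUMBER, DLOCATION)
);
""")


def primary_key(table):
    """Return the primary key columns of a table, in key order (hypothetical helper)."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [name for _, name, _, _, _, pk in sorted(rows, key=lambda r: r[5]) if pk > 0]


print(primary_key("EMPLOYEE"), primary_key("WORKS_ON"), primary_key("DEPT_LOCATIONS"))
```

Note the contrast: the 1:N relationship adds only a column to an existing relation, while the M:N relationship and the multivalued attribute each need a relation of their own with a composite key.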

Step 7: Mapping of N-ary Relationship Types. For each n-ary relationship type R, where n>2, create a new relation S to represent R. Include as foreign key attributes in S the primary keys of the relations that represent the participating entity types. Also include any simple attributes of the n-ary relationship type (or simple components of composite attributes) as attributes of S.

Example: The relationship type SUPPLY in the ER diagram can be mapped to the relation SUPPLY shown in the relational schema, whose primary key is the combination of the three foreign keys {SNAME, PARTNO, PROJNAME}.
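A ternary SUPPLY relation might be declared as in the sketch below. Only SUPPLY and its key attributes SNAME, PARTNO, and PROJNAME come from the text; the participating relation names SUPPLIER, PART, and PROJECT, the QUANTITY attribute, the column types, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Hypothetical participating relations, reduced to their keys.
CREATE TABLE SUPPLIER (SNAME    TEXT    PRIMARY KEY);
CREATE TABLE PART     (PARTNO   INTEGER PRIMARY KEY);
CREATE TABLE PROJECT  (PROJNAME TEXT    PRIMARY KEY);

-- The n-ary relationship becomes its own relation with n foreign keys.
CREATE TABLE SUPPLY (
    SNAME    TEXT    NOT NULL REFERENCES SUPPLIER(SNAME),
    PARTNO   INTEGER NOT NULL REFERENCES PART(PARTNO),
    PROJNAME TEXT    NOT NULL REFERENCES PROJECT(PROJNAME),
    QUANTITY INTEGER,
    PRIMARY KEY (SNAME, PARTNO, PROJNAME)
);

INSERT INTO SUPPLIER VALUES ('Acme');
INSERT INTO PART     VALUES (1);
INSERT INTO PROJECT  VALUES ('ProductX'), ('ProductY');
INSERT INTO SUPPLY   VALUES ('Acme', 1, 'ProductX', 100);
INSERT INTO SUPPLY   VALUES ('Acme', 1, 'ProductY', 50);
""")

# Repeating an existing (supplier, part, project) combination violates the key.
try:
    conn.execute("INSERT INTO SUPPLY VALUES ('Acme', 1, 'ProductX', 7)")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False

supply_rows = conn.execute("SELECT COUNT(*) FROM SUPPLY").fetchone()[0]
print(duplicate_allowed, supply_rows)
```

The same supplier and part may appear for two different projects, but not twice for the same project, which is what the three-attribute key expresses.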

SUMMARY: Correspondence between ER and Relational Models

ER Model                         Relational Model
Entity type                      Entity relation
1:1 or 1:N relationship type     Foreign key (or relationship relation)
M:N relationship type            Relationship relation and two foreign keys
n-ary relationship type          Relationship relation and n foreign keys
Simple attribute                 Attribute
Composite attribute              Set of simple component attributes
Multivalued attribute            Relation and foreign key
Value set                        Domain
Key attribute                    Primary (or secondary) key
