Chapter 5: Query Optimization: Acknowledgements: Slides Are Adapted From Böhlen and
Introduction
Statistical information for cost estimation
Transformation of relational expressions (equivalence rules)
Rule-based and cost-based optimization
Optimizing nested subqueries
Materialized views and view maintenance
2. Selection operations are commutative:
$\sigma_{\theta_1}(\sigma_{\theta_2}(E)) = \sigma_{\theta_2}(\sigma_{\theta_1}(E))$
[Figure: pictorial depiction of equivalence rules 5, 6a, and 7a]
Equivalence Rules ...
7. The selection operation distributes over the theta join
operation under the following two conditions:
(a) When all the attributes in $\theta_0$ involve only the attributes of
one of the expressions (E1) being joined:
$\sigma_{\theta_0}(E_1 \bowtie_\theta E_2) = (\sigma_{\theta_0}(E_1)) \bowtie_\theta E_2$
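Rule 7(a) can be checked on a small example. The sketch below is only an illustration: the relations, attribute names, and helper functions (`select`, `theta_join`) are made up for this purpose. It applies a selection that mentions only E1 attributes once after the join and once before it, and compares the results.

```python
# Toy relations as lists of dicts; all names and data are hypothetical.
E1 = [{"a": 1, "b": 10}, {"a": 2, "b": 20}, {"a": 3, "b": 30}]
E2 = [{"c": 1, "d": "x"}, {"c": 2, "d": "y"}, {"c": 4, "d": "z"}]


def select(pred, rel):
    """Selection: keep the tuples of rel that satisfy pred."""
    return [t for t in rel if pred(t)]


def theta_join(rel1, rel2, theta):
    """Theta join: combine every pair of tuples that satisfies theta."""
    return [{**t1, **t2} for t1 in rel1 for t2 in rel2 if theta({**t1, **t2})]


def theta(t):
    # Join condition over attributes of both relations.
    return t["a"] == t["c"]


def theta0(t):
    # Selection condition that uses only attributes of E1.
    return t["b"] >= 20


# Left-hand side of rule 7(a): select after joining.
lhs = select(theta0, theta_join(E1, E2, theta))
# Right-hand side of rule 7(a): select on E1 first, then join.
rhs = theta_join(select(theta0, E1), E2, theta)

print(lhs == rhs)
```

In the second form the join reads only the two E1 tuples that survive the selection, which is why optimizers prefer to push selections down.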
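8. The projection operation distributes over the theta join operation as follows:
(a) If $\theta$ involves only attributes from $L_1 \cup L_2$:
$\Pi_{L_1 \cup L_2}(E_1 \bowtie_\theta E_2) = (\Pi_{L_1}(E_1)) \bowtie_\theta (\Pi_{L_2}(E_2))$
(b) Consider a join $E_1 \bowtie_\theta E_2$.
Let $L_1$ and $L_2$ be sets of attributes from $E_1$ and $E_2$, respectively.
Let $L_3$ be attributes of $E_1$ that are involved in join condition $\theta$, but are not in
$L_1 \cup L_2$, and
let $L_4$ be attributes of $E_2$ that are involved in join condition $\theta$, but are not in
$L_1 \cup L_2$. Then
$\Pi_{L_1 \cup L_2}(E_1 \bowtie_\theta E_2) = \Pi_{L_1 \cup L_2}((\Pi_{L_1 \cup L_3}(E_1)) \bowtie_\theta (\Pi_{L_2 \cup L_4}(E_2)))$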
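Join ordering: for all relations $r_1$, $r_2$, $r_3$, $(r_1 \bowtie r_2) \bowtie r_3 = r_1 \bowtie (r_2 \bowtie r_3)$.
If $r_2 \bowtie r_3$ is quite large and $r_1 \bowtie r_2$ is small, we choose $(r_1 \bowtie r_2) \bowtie r_3$
so that we compute and store a smaller temporary relation.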
Transformation Examples ...
Example 5: Join ordering
Consider the expression
$\Pi_{\text{customer-name}}((\sigma_{\text{branch-city} = \text{"Brooklyn"}}(\text{branch})) \bowtie \text{account} \bowtie \text{depositor})$
Could compute $(\text{account} \bowtie \text{depositor})$ first, and join result with
$\sigma_{\text{branch-city} = \text{"Brooklyn"}}(\text{branch})$
but $(\text{account} \bowtie \text{depositor})$ is likely to be a large relation.
Since it is more likely that only a small fraction of the bank’s
customers have accounts in branches located in Brooklyn, it is
better to compute first
$\sigma_{\text{branch-city} = \text{"Brooklyn"}}(\text{branch}) \bowtie \text{account}$
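The effect of the join order on intermediate result size can be seen on a tiny instance. This is a minimal sketch with invented data: the rows, the underscore attribute names, and the `natural_join` helper are assumptions made for illustration, not part of the slides. Both orders return the same customers, but the temporary relation differs in size.

```python
def natural_join(r, s):
    """Natural join of two lists of dicts on their common attribute names."""
    out = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[k] == u[k] for k in common):
                out.append({**t, **u})
    return out


# Hypothetical bank data.
branch = [
    {"branch_name": "Downtown", "branch_city": "Brooklyn"},
    {"branch_name": "Perryridge", "branch_city": "Horseneck"},
    {"branch_name": "Redwood", "branch_city": "Palo Alto"},
]
account = [
    {"account_number": "A-101", "branch_name": "Downtown"},
    {"account_number": "A-102", "branch_name": "Perryridge"},
    {"account_number": "A-201", "branch_name": "Perryridge"},
    {"account_number": "A-222", "branch_name": "Redwood"},
    {"account_number": "A-305", "branch_name": "Redwood"},
]
depositor = [
    {"customer_name": "Johnson", "account_number": "A-101"},
    {"customer_name": "Hayes", "account_number": "A-102"},
    {"customer_name": "Jones", "account_number": "A-102"},
    {"customer_name": "Smith", "account_number": "A-201"},
    {"customer_name": "Lindsay", "account_number": "A-222"},
    {"customer_name": "Turner", "account_number": "A-305"},
]

brooklyn = [b for b in branch if b["branch_city"] == "Brooklyn"]

# Order A: (account JOIN depositor) first, then join with the Brooklyn branches.
tmp_a = natural_join(account, depositor)
result_a = {t["customer_name"] for t in natural_join(brooklyn, tmp_a)}

# Order B: (Brooklyn branches JOIN account) first, then join with depositor.
tmp_b = natural_join(brooklyn, account)
result_b = {t["customer_name"] for t in natural_join(tmp_b, depositor)}

print(len(tmp_a), len(tmp_b), result_a == result_b)
```

On this instance order A materializes six temporary tuples and order B only one, although the final answers agree.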
Enumeration of Equivalent Expressions
Query optimizers use equivalence rules to systematically
generate expressions equivalent to the given expression
Algorithm
Repeat
For each expression found so far, use all applicable equivalence rules, and add newly
generated expressions to the set of expressions found so far
Until no more expressions can be found
This approach is very expensive in space and time
Space requirements are reduced by sharing common sub-expressions:
When E1 is generated from E2 by an equivalence rule, usually only the
top levels of the two differ; the subtrees below are the same and can
be shared (e.g. when applying join associativity)
Time requirements are reduced by not generating all expressions
(e.g. take cost estimates into account)
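The exhaustive algorithm above can be sketched for join expressions only. The following is an illustration under stated assumptions, not an optimizer implementation: expressions are nested tuples `("join", left, right)` with relation names as strings, only join commutativity and associativity are used as rules, and a worklist replaces the repeat/until loop (it reaches the same fixpoint). It does not implement sub-expression sharing or cost-based pruning.

```python
def rewrites_at_root(e):
    """Yield expressions obtained by applying one rule at the root of e."""
    if isinstance(e, str):  # a base relation: no rule applies
        return
    _, left, right = e
    # Commutativity: E1 join E2 = E2 join E1
    yield ("join", right, left)
    # Associativity, left to right: (A join B) join C = A join (B join C)
    if isinstance(left, tuple):
        _, a, b = left
        yield ("join", a, ("join", b, right))
    # Associativity, right to left: A join (B join C) = (A join B) join C
    if isinstance(right, tuple):
        _, b, c = right
        yield ("join", ("join", left, b), c)


def rewrites(e):
    """Yield expressions obtained by applying one rule anywhere inside e."""
    yield from rewrites_at_root(e)
    if isinstance(e, tuple):
        _, left, right = e
        for new_left in rewrites(left):
            yield ("join", new_left, right)
        for new_right in rewrites(right):
            yield ("join", left, new_right)


def enumerate_equivalent(expr):
    """Apply all rules repeatedly until no new expression is found."""
    found = {expr}
    worklist = [expr]
    while worklist:
        current = worklist.pop()
        for new in rewrites(current):
            if new not in found:
                found.add(new)
                worklist.append(new)
    return found


start = ("join", ("join", "r1", "r2"), "r3")
equivalents = enumerate_equivalent(start)
print(len(equivalents))
```

Even with two rules and three relations the set already contains 12 expressions, which illustrates why the approach is expensive and why practical optimizers avoid generating every alternative.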
Evaluation Plan
An evaluation plan (query plan) defines exactly what algorithm
is used for each operation, and how the execution of the
operations is coordinated.
Also called evaluation tree (query tree)
Choice of Evaluation Plans
When choosing the “best” evaluation plan, the query optimizer
must consider the interaction of evaluation techniques:
Choosing the cheapest algorithm for each operation independently may
not yield the best overall plan, e.g.
merge-join may be costlier than hash-join, but may provide a sorted output which
reduces the cost for an outer level aggregation.
nested-loop join may provide opportunity for pipelining
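The merge-join case can be made concrete with invented numbers. In this sketch every cost figure and name is a hypothetical placeholder chosen for illustration: hash-join is cheaper on its own, but its output is unsorted, so the aggregation above it must pay for a sort.

```python
# Hypothetical costs (arbitrary units); not measurements of any real system.
JOIN_ALGOS = {
    "hash-join": {"cost": 100, "sorted_output": False},
    "merge-join": {"cost": 120, "sorted_output": True},
}
AGG_COST_ON_SORTED_INPUT = 10
SORT_COST = 50


def plan_cost(join_algo):
    """Total cost of the join followed by a sort-based aggregation."""
    info = JOIN_ALGOS[join_algo]
    agg_cost = AGG_COST_ON_SORTED_INPUT
    if not info["sorted_output"]:
        agg_cost += SORT_COST  # must sort before aggregating
    return info["cost"] + agg_cost


# Choosing the join in isolation picks the locally cheapest algorithm.
greedy_join = min(JOIN_ALGOS, key=lambda a: JOIN_ALGOS[a]["cost"])
# Choosing by the cost of the whole plan accounts for the interaction.
best_join = min(JOIN_ALGOS, key=plan_cost)

print(greedy_join, plan_cost(greedy_join))
print(best_join, plan_cost(best_join))
```

With these assumed numbers the locally cheapest join leads to a total of 160, while the costlier merge-join gives a total of 130.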