db-derby-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Apache Wiki <wikidi...@apache.org>
Subject [Db-derby Wiki] Update of "ReferencedTableMaps" by Army
Date Fri, 10 Nov 2006 22:30:31 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Db-derby Wiki" for change notification.

The following page has been changed by Army:
http://wiki.apache.org/db-derby/ReferencedTableMaps

New page:
As part of query "preprocessing", which is the first of Derby's [http://wiki.apache.org/db-derby/LanguageOptimize
three phases of optimization], every result set node in the query tree figures out what its
"referenced table map" is. In Derby terminology a "table map" is a bit map which, as a general
rule, contains one bit for every Optimizable in the query that has a non-negative [http://wiki.apache.org/db-derby/OptimizerTableNumbers
table number]. An "Optimizable" is defined as any result set node that can occur in the FROM
clause of a SELECT statement. So for a given result set node {{{rsn}}} in some query, {{{rsn}}}'s
"referenced table map" is a table map in which the bits corresponding to any Optimizables
that {{{rsn}}} references have been set. Note that a "referenced table map" is also called
a "referenced set" in some parts of the Derby code.

That said, each Optimizable is responsible for building its own "referenced table map". A
!FromBaseTable, for example, simply sets the bit that corresponds to itself. Likewise for
a !RowResultSetNode. Any class that extends !SetOperatorNode, though,has a referenced map
that is the "or" of its children's maps. And so on. At the time of writing the following result
set nodes set their referenced maps as indicated here:

  * !CurrentOfNode: empty
  * !FromBaseTable: self
  * !FromSubquery: child's map
  * FromVTI: self
  * !ProjectRestrictNode: child's map
  * !RowResultSetNode: self
  * !SelectNode: "or" of all maps for Optimizables in the FROM list
  * !SetOperatorNode (!UnionNode, !IntersectOrExceptNode): left child's map "or'd" with right
child's map, but NO self
  * !SingleChildResultSetNode: child's map
  * !TableOperatorNode (!JoinNode, !HalfOuterJoinNode): left child's map "or'd" right child's
map, PLUS self

As an example, assume we have the following query:

{{{
select ppl_info.id, ppl_info.fullname
  from happy_ppl_ids, ppl_info
  where happy_ppl_ids.id = ppl_info.id
}}}

In this query there are two Optimizables: the first is {{{happy_ppl_ids}}}, the second is
{{{ppl_info}}}. So all "table maps" for this query will have two bits in them, one for each
Optimizable. As part of [http://wiki.apache.org/db-derby/LanguageBinding query binding] each
Optimizable will get assigned a specific [http://wiki.apache.org/db-derby/OptimizerTableNumbers
"table number"]. In the the example above {{{happy_ppl_ids}}} will end up with table number
"0" and {{{ppl_info}}} will end up with table number "1". The referenced table map for the
!SelectNode will then have both bits set because it (the !SelectNode) references both tables.
The referenced map for {{{happy_ppl_info}}} will only have bit "0" set; the referenced map
for {{{ppl_info}}} will only have bit "1" set.

In addition to result set nodes, there are other classes in Derby that contain "referenced
table" information, as well. Of particular interest are the Predicate and the !ColumnReference
classes. The {{{referencedSet}}} field of a Predicate indicates which Optimizables are referenced
by that Predicate. In the example above, the predicate {{{(happy_ppl_ids.id = ppl_info.id)}}}
will reflect the tables referenced by the equality, and thus it will have both bits "0" and
bits "1" set. As for !ColumnReferences, a specific column reference does not have a "bit map"
per se, but it does have a field to hold the table number of the Optimizable to which it (the
!ColumnReference) points. In the example query there are actually four column references:

  1. {{{ppl_info.id}}} from the SELECT's result column list
  1. {{{ppl_info.fullname}}} from the SELECT's result column list
  1. {{{happy_ppl_ids.id}}} from the left side of the predicate
  1. {{{ppl_info.id}}} from the right side of the predicate

The third column reference in this list will have its {{{tableNumber}}} field set to "0" because
it is pointing to {{{happy_ppl_ids}}} and we said above that {{{happy_ppl_ids}}} was designated
as table "0". All other column references in this list will have table number "1" because
they point to {{{ppl_info}}}.

For more on table numbers, see OptimizerTableNumbers.

Both Optimizables in the example above are base tables, but that is not a requirement. Nor
is it a requirement that the Optimizables occur as part of a SELECT query; they may be created
in other ways, as well. For example if we have the following query:

{{{
values 'hi', 'bye', 'salut', 'ciao'
}}}

Derby will actually create SEVEN Optimizables, as demonstrated by the following query tree
("RRSN" stands for !RowResultSetNode, which is an instance of Optimizable representing a single
row in the result set returned by a VALUES query):

{{{
                     UNION[6]
                      /    \
                 UNION[4]   RRSN[5] ("ciao")
                /      \
          UNION[2]      RRSN[3] ("salut")
            /    \
  RRSN[0] ("hi")  RRSN[1] ("bye")
}}}

In this tree the bracketed digit for each Optimizable represents the "table number" assigned
to that Optimizable during binding. The quoted strings in parentheses correspond to rows in
the VALUES result set. For the above query, then, the "referenced table map" for each UNION
and !RowResultSetNode will have seven bits in it, with the bits set as follows (where "{ i
}" means that the bit corresponding to the Optimizable whose table number is {{{i}}} has been
set):

  * RRSN[0]:  { 0 }
  * RRSN[1]:  { 1 }
  * RRSN[3]:  { 3 }
  * RRSN[5]:  { 5 }
  * UNION[2]: { 0, 1 }
  * UNION[4]: { 0, 1, 3 }
  * UNION[6]: { 0, 1, 3, 5 }

Generally speaking "referenced table maps" are useful for determining what dependencies exist
between the different result set nodes, columns, and column references within a query. They
also make it possible for the Derby optimizer to figure out when and where to push predicates
down the query tree, as described in more detail on the PredicatePushdown page.

Mime
View raw message