db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Army <qoz...@gmail.com>
Subject Re: Question regarding runtimestatistics and join order
Date Thu, 02 Oct 2008 16:05:56 GMT
Knut Anders Hatlen wrote:
> Manjula Kutty <manjula.kutty@gmail.com> writes:
>> In the given plan did you find anything about the join order??
> 
> I think the join order is T1, T3. Under Hash Join ResultSet, the plan
> says that the left result set is a Distinct Scan ResultSet for T1, and
> the right result set is a Hash Table ResultSet whose source result set
> is a Distinct Scan ResultSet for T3.

Right.

As a general rule, when looking at a given query plan the join order is 
reflected by the order in which you see the table names as you scroll 
from the top of the plan downward.

In the plan that Manjula originally posted "T1" appears first (i.e. 
closer to the beginning of the plan), then "T3" appears afterward, so 
that's a general indication that the join order is "T1", "T3".

Note that with respect to the top-level query in question, i.e. to:

   select x1.j, x2.b from
     (select distinct i,j from t1) x1,
     (select  distinct a,b from t3) x2
   where x1.i = x2.a
   order by x1.j, x2.b

the join order is _technically_ { X1, X2 }.  But the query plan only 
shows base table access, so we have to look to see what tables X1 and X2 
access.  In this case X1 accesses T1 while X2 accesses T3, so when we 
scan the plan and see { T1, T3 }, that effectively implies a join order 
of { X1, X2 }.  I'm not sure if the test comments are consistent in 
terms of whether they'll say { T1, T3 } or { X1, X2 }, so I thought I'd 
mention it...

On a more general level, the "scan downward" approach to finding the 
join order works because the query plan is written in terms of "left" 
and "right" result sets, as Knut Anders mentioned.  If I'm joining three 
tables T1, T2, T3 and the join order chosen by the optimizer is {T2, T3, 
T1} the final query tree will look something like:

        JOIN_0
         /  \
     JOIN_1  T1
       /  \
      T2   T3

Notice how each join node has a "left" and a "right" child.  The query 
plan is generated in depth-first traversal order (starting with the 
root), so the query plan for the above tree would look something like:

JoinResultSet_0:
+++ LeftResultSet:
++++++ JoinResultSet_1:
+++++++++ LeftResultSet:  T2
+++++++++ RightResultSet: T3
+++ RightResultSet:       T1

 From this we can see that the order in which the tables appear in the 
query plan (reading top to bottom) will match the order that comes from 
reading the leaf nodes of the join tree left-to-right, and that in turn 
reflects the join order chosen by the optimizer.

Army


Mime
View raw message