db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Army <qoz...@sbcglobal.net>
Subject [OPTIMIZER] Optimizer "timeout" for subqueries?
Date Fri, 03 Mar 2006 00:47:52 GMT
While working on changes for DERBY-805 and DERBY-1007, I noticed that the 
"timeout" mechanism for the Derby optimizer is never reset for subqueries.  As 
is usually required when talking about the optimizer, let me give an example of 
what I mean.

If we have a query such as:

select <...> from
   (select t1.i, t2.j from t1, t2 where <...>) X1,
    T3
where <...>

then we would have one "outer" query and one "subquery".  The outer query would 
be "select <...> from X1, T3", the subquery would be "select t1.i, t2.j from t1, 
t2".

In this case Derby will create two instances of OptimizerImpl: one for the outer 
query (call it OI_OQ) and one for the subquery (call it OI_SQ).  Each 
OptimizerImpl has its own timeout "clock" that it initializes at creation 
time--but never resets.  If timeout occurs, the OptimizerImpl will stop 
searching for "the" best plan and will just take the best plan found so far.

That said, for every permutation of the outer query a call will be made to 
optimize the subquery.  To simplify things, let's assume there are only two 
permutations of the outer query: one with join order {X1, T3} and another with 
join order {T3, X1}.

Now let's say we're looking at the first permutation {X1, T3}.  OI_OQ will make 
a call to optimize the subquery represented by OI_SQ.  Let's further say that 
the subquery tries some permutation {T1, T2} and then times out.  It then 
returns the plan information for {T1, T2} to the outer query.  The outer query, 
which has *not* yet timed out, then decides to try its second permutation {T3, 
X1}.  So it again makes a call to optimize the subquery.  In this case, the 
subquery--which has already timed out--will *immediately* return without trying 
to optimize anything.  The outer query will then make a decision about its 
second permutation based on the un-optimized subquery's plan results.

My question is: is this intentional behavior?

On the one hand, I can sort of see the logic in not trying to optimize the 
subquery after the first time, because if we've "timed out" then we just want to 
"wrap things up" as quickly as possible.

On the other hand, the outer query--which is the query representing the 
top-level statement that the user is executing--has *not* yet timed out, so it 
seems to me like the second call to optimize the subquery should get a "clean 
start" instead of timing out right away.

This hasn't really been an issue to date because the "best plan" chosen by the 
subquery is typically independent of the outer query's current permutation--with 
the exception of "outerCost", which is passed in from the outer query and is 
factored into the subquery's cost estimates.  Because of this relative 
independence, the plan chosen by the subquery would rarely (if ever?) change 
with different permutations of the outer query, so if the subquery timed out 
once there was no point in trying to re-optimize it again later.

With my changes for DERBY-805, though, I'm introducing the notion of pushing 
predicates from outer queries down into subqueries--which means that the outer 
join order can have a very significant impact on the plan chosen by the 
subquery.  But because the timeout mechanism is never reset, we could end up 
skipping the second optimization phase of the subquery, which means we never get 
a chance to see how much the outer predicates can help, and thus we could end up 
skipping over some plans that have the potential to give us significant 
performance improvement.

It's hard to come up with a concrete example of this because it depends on 
optimizer timeout, which varies with different machines.  But in my work with 
DERBY-805 I've seen cases where the optimizer ignores pushed predicates because 
of subquery timeout and thus ends up choosing a (hugely) sup-optimal plan.

All of that said, in an attempt to resolve this issue I added the following two 
lines to the "prepForNextRound()" method in OptimizerImpl (that method was only 
recently added as part of a change for DERBY-805):

+	/* Reset timeout state since we want to timeout on a per-round
+	 * basis.  Otherwise if we timed out during a previous round and
+	 * then we get here for another round, we'll immediately
+	 * "timeout" again before optimizing any of the Optimizables--
+	 * which means we could be missing out on some helpful
+	 * optimizations that weren't available for the previous
+	 * round (esp. use of indexes/qualifiers that are made possible
+	 * by predicates pushed down from the outer query).  So by
+	 * resetting the timeout state here, we give ourselves a chance
+	 * to do some legitimate optimization before (potentially) timing
+	 * out again.  This could allow us to find some access plans that
+	 * are significantly better than anything we found in previous
+	 * rounds.
+	 */
+	timeOptimizationStarted = System.currentTimeMillis();
+	timeExceeded = false;

I applied this change locally in combination with the DERBY-1007 patch and ran 
derbyall on Linux Red Hat with IBM 1.4.2 with no failures.

Does anyone out there know if this is a safe change?  I would appreciate any 
feedback/advice here.  If I don't hear any objections, I'll file a sub-task to 
DERBY-805 for this issue and post the above patch there.

Many thanks in advance to anyone who might have input/feedback here.
Army


Mime
View raw message