db-derby-dev mailing list archives

From "Rick Hillegas (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DERBY-6211) Make Optimizer trace logic pluggable.
Date Thu, 08 Aug 2013 19:10:51 GMT

     [ https://issues.apache.org/jira/browse/DERBY-6211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Hillegas updated DERBY-6211:

    Attachment: derby-6211-13-aa-SelectNode_optimizer.diff

Attaching derby-6211-13-aa-SelectNode_optimizer.diff. This patch moves the getOptimizer()
method out of ResultSetNode down into SelectNode, the only subclass which actually calls it.
The optimizer creation in SelectNode is balanced by a call to traceEndQueryBlock(), just as
was done for TableOperatorNode in the previous patch. I am running tests now.

At this point, the SelectNode and TableOperatorNode are the only nodes which create optimizers.
This helps us reason about where optimizable query blocks occur and ensure that every optimizer
creation is balanced with a call to traceEndQueryBlock() after join optimization via that
optimizer is finished.
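
The balancing rule above can be illustrated with a small sketch. This is not Derby's code: QueryBlockTracer, RecordingTracer, BalancedTraceSketch, and traceStartQueryBlock() are hypothetical names invented for the illustration, and only the name traceEndQueryBlock() comes from the patch description. Using try/finally to guarantee the pairing is an assumption about one reasonable way to do it, not a statement about how the patch does it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tracer interface; not Derby's actual tracing API. */
interface QueryBlockTracer {
    void traceStartQueryBlock(String blockName);
    void traceEndQueryBlock();
}

/** Records events and counts how many query blocks are still open. */
class RecordingTracer implements QueryBlockTracer {
    final List<String> events = new ArrayList<>();
    int openBlocks = 0;

    public void traceStartQueryBlock(String blockName) {
        openBlocks++;
        events.add("start:" + blockName);
    }

    public void traceEndQueryBlock() {
        openBlocks--;
        events.add("end");
    }
}

class BalancedTraceSketch {
    /** Stand-in for creating an optimizer and join optimizing one query block. */
    static void optimizeQueryBlock(QueryBlockTracer tracer, String blockName,
                                   Runnable joinOptimization) {
        tracer.traceStartQueryBlock(blockName); // paired with optimizer creation
        try {
            joinOptimization.run();
        } finally {
            tracer.traceEndQueryBlock(); // runs even if join optimization throws
        }
    }

    public static void main(String[] args) {
        RecordingTracer tracer = new RecordingTracer();
        optimizeQueryBlock(tracer, "SelectNode",
                () -> tracer.events.add("join-order search"));
        System.out.println(tracer.events + " open=" + tracer.openBlocks);
    }
}
```

With only two node types creating optimizers, a check like openBlocks == 0 at the end of compilation would be enough to detect an unbalanced query block in this toy model.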

I would like to be able to hide or throw away the OptimizerImpl after join optimization is
done. But that is not possible right now. Before explaining why, let me first sketch my current
understanding of the phases of optimization:

1) Preprocessing. This is the phase in which the query is rewritten. Rewriting tasks include
subquery flattening and putting predicates into conjunctive normal form.

2) Join optimization. Most of what we describe as optimization happens in this phase. It is
the phase which selects join orders, join strategies, and access paths.

3) Projection and restriction. This is the phase in which predicates are pushed down as close
to the row sources as possible, and rows are pruned back to the minimal set of columns
needed by higher operators in the plan. The workhorses for this phase are the modifyAccessPath()
methods. This phase also does some miscellaneous, mechanical processing of the CostEstimates
which were calculated during join optimization.

OptimizerImpl is responsible for phases (2) and (3). When a ResultSetNode is join optimized,
the OptimizerImpl is saved away in ResultSetNode.optimizer so that it can be dug up later
to perform projection and restriction. The OptimizerImpl used for join optimization retains
information which is needed for projection and restriction.

I don't like the fact that the OptimizerImpl is stashed away. That makes it hard to reason
about when join optimization is done. I hope that the traceEndQueryBlock() calls now mark
the end of join optimization attempts. I would prefer to save just the projection/restriction
and CostEstimate variables rather than the whole OptimizerImpl. But that's a rototill I don't
want to embark on now.
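
The preferred design, keeping only what projection and restriction need, might look roughly like the sketch below. Everything in it is hypothetical: JoinOptimizationResult, ThrowawayOptimizer, QueryBlockSketch, and their fields are illustrative stand-ins, and the fields do not claim to be the state that OptimizerImpl really retains.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical immutable summary of what the later phase needs. */
final class JoinOptimizationResult {
    final List<String> joinOrder;
    final double estimatedCost;

    JoinOptimizationResult(List<String> joinOrder, double estimatedCost) {
        // Defensive, unmodifiable copy so later phases cannot alter the result.
        this.joinOrder = Collections.unmodifiableList(new ArrayList<>(joinOrder));
        this.estimatedCost = estimatedCost;
    }
}

/** Stand-in for an optimizer whose working state is discarded after use. */
class ThrowawayOptimizer {
    private final List<String> tables;

    ThrowawayOptimizer(List<String> tables) {
        this.tables = tables;
    }

    /** Pretends to search join orders; here it simply keeps the input order. */
    JoinOptimizationResult optimize() {
        double cost = 10.0 * tables.size(); // placeholder cost model
        return new JoinOptimizationResult(tables, cost);
    }
}

/** Stand-in for a query-block node that runs phases (2) and (3). */
class QueryBlockSketch {
    private JoinOptimizationResult result; // the only state kept across phases

    void joinOptimize(List<String> tables) {
        ThrowawayOptimizer optimizer = new ThrowawayOptimizer(tables);
        result = optimizer.optimize();
        // The optimizer is a local variable, so no later phase can reach it.
    }

    String projectAndRestrict() {
        if (result == null) {
            throw new IllegalStateException("join optimization has not run");
        }
        return "plan over " + result.joinOrder + " cost=" + result.estimatedCost;
    }
}
```

In this shape, the end of join optimization is simply the point where optimize() returns, which is the property the stashed OptimizerImpl makes hard to see.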

Touches the following files:


M       java/engine/org/apache/derby/impl/sql/compile/ResultSetNode.java
M       java/engine/org/apache/derby/impl/sql/compile/SelectNode.java

Moves getOptimizer() down into SelectNode and adds a traceEndQueryBlock() call.


M       java/testing/org/apache/derbyTesting/functionTests/tests/lang/XMLOptimizerTraceTest.java

Adds a test for xml-based optimizer tracing of outer joins.

> Make Optimizer trace logic pluggable.
> -------------------------------------
>                 Key: DERBY-6211
>                 URL: https://issues.apache.org/jira/browse/DERBY-6211
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions:
>            Reporter: Rick Hillegas
>            Assignee: Rick Hillegas
>              Labels: derby_triage10_11
>         Attachments: derby-6211-01-aa-createPlugin.diff, derby-6211-02-aa-cleanup.diff,
> derby-6211-02-ab-cleanup.diff, derby-6211-03-aa-customTracer.diff, derby-6211-04-aa-moveOptimizerTracerToEngineJar.diff,
> derby-6211-05-aa-xmlOptimizerTracer.diff, derby-6211-06-ab-packageProtect-XMLOptTrace.diff,
> derby-6211-07-aa-useSchemaQualifiedNamesInSummaries.diff, derby-6211-07-ab-useSchemaQualifiedNamesInSummaries.diff,
> derby-6211-08-aa-fixNPE.diff, derby-6211-09-aa-addTests.diff, derby-6211-10-aa-makingCostEstimateObject.diff,
> derby-6211-11-aa-moveTracerOutOfOptimizer.diff, derby-6211-11-ab-moveTracerOutOfOptimizer.diff,
> derby-6211-12-aa-traceEndOfQueryBlock.diff, derby-6211-13-aa-SelectNode_optimizer.diff
>
> Right now the trace logic in the optimizer is hard-coded to produce a stream of diagnostics.
> It would be good to be able to plug alternative trace logic into the optimizer. This would
> make the following possible:
> 1) Plug in trace logic which produces formats which are easier to study and which can
> be analyzed mechanically. E.g., xml formatted output.
> 2) Plug in trace logic which can be used during unit testing to verify that the optimizer
> has picked the right plan. Over time this might make it easier to migrate canon-based tests
> to assertion-based tests.
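
The two uses listed in the issue description can be sketched with a toy plug-in point. None of these names are Derby's: PlanTracer, XmlPlanTracer, BestPlanTracer, and ToyOptimizer are hypothetical, the costs are made up, and the XML writer does no escaping because the table names are assumed to be simple identifiers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical plug-in point; Derby's real tracer interface may differ. */
interface PlanTracer {
    void traceJoinOrder(List<String> joinOrder, double cost);
}

/** Use 1: output that can be analyzed mechanically (XML-like, no escaping). */
class XmlPlanTracer implements PlanTracer {
    final StringBuilder out = new StringBuilder();

    public void traceJoinOrder(List<String> joinOrder, double cost) {
        out.append("<joinOrder cost=\"").append(cost).append("\">");
        for (String table : joinOrder) {
            out.append("<table>").append(table).append("</table>");
        }
        out.append("</joinOrder>");
    }
}

/** Use 2: remember the cheapest order so a unit test can assert on it. */
class BestPlanTracer implements PlanTracer {
    List<String> best;
    double bestCost = Double.POSITIVE_INFINITY;

    public void traceJoinOrder(List<String> joinOrder, double cost) {
        if (cost < bestCost) {
            bestCost = cost;
            best = new ArrayList<>(joinOrder);
        }
    }
}

/** Toy optimizer that reports every join order it considers to the tracer. */
class ToyOptimizer {
    private final PlanTracer tracer;

    ToyOptimizer(PlanTracer tracer) {
        this.tracer = tracer;
    }

    void optimize() {
        tracer.traceJoinOrder(Arrays.asList("T1", "T2"), 50.0); // made-up costs
        tracer.traceJoinOrder(Arrays.asList("T2", "T1"), 20.0);
    }
}
```

An assertion-based test would then construct ToyOptimizer with a BestPlanTracer, call optimize(), and compare best against the expected join order instead of diffing a canon file.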

