db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bryan Pendleton (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-4587) Add tools for improved analysis and understanding of query plans and execution statistics
Date Sun, 30 May 2010 19:19:37 GMT

    [ https://issues.apache.org/jira/browse/DERBY-4587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873472#action_12873472
] 

Bryan Pendleton commented on DERBY-4587:
----------------------------------------

Hi Nirmal,

It seems that the current XML document built by the v3 patch is capturing all
of the result set nodes, but it is "flattening" them into a simple list, which loses
some of the information that is present in the query plan.

I think that we will want to build a richer data structure in the AccessDatabase class,
in order to capture the the natural tree structure of a query plan. Entries in a query
plan are structured into parent-child relationships, (there's a nice writeup on the
basic concepts here: http://en.wikipedia.org/wiki/Query_optimizer) and we need 
to be able to capture that structure into the natural tree structure of the 
resulting XML document.

For example, consider this simple query:

  create table t1 (c1 int, c2 varchar(100));
  create table t2 (a int, b int);
  select t1.c1, t1,c2 from t1 inner join t2 on c1 = a;

This query is executed using a 3-level query tree, which can be seen
by looking at the RS_ID and PARENT_RS_ID columns in SYSXPLAN_RESULTSETS:

ij> select rs_id, op_identifier, parent_rs_id from s2.sysxplain_resultsets;
RS_ID                                                                   |OP_IDENTIFIER   
             |PARENT_RS_ID                        
--------------------------------------------------------------------------------------------------------
2589c06b-0128-ea8c-c9a5-000000e76260|PROJECTION                    |NULL                 
              
addac06e-0128-ea8c-c9a5-000000e76260|HASHJOIN                      |2589c06b-0128-ea8c-c9a5-00000e76260
862e0071-0128-ea8c-c9a5-000000e76260|TABLESCAN                     |addac06e-0128-ea8c-c9a5-000000e76260
5e838074-0128-ea8c-c9a5-000000e76260|HASHSCAN                      |addac06e-0128-ea8c-c9a5-000000e76260

The PROJECTION node is the "root" of the tree, and it has 1 child, the HASHJOIN node,
which in turn has two children, the TABLESCAN and HASHSCAN nodes.

As a tree, it looks something like:

                PROJECTION
                         |
                HASHJOIN
                        |
        +----------+--------------+
        |                                    |
  TABLESCAN           HASHSCAN

When we format this query into an XML document, the tree structure needs to
be preserved, so that we end up with an XML document which expresses
the tree structure using nesting of XML nodes, so that  we get something like:

  <node>
  <type>PROJECTION</type>
      <node>
      <type>HASHJOIN</type>
          <node>
          <type>TABLESCAN</type>
          </node>
          <node>
          <type>HASHSCAN</type>
          </node>
      </node>
  </node>

I think that the tool will need a more sophisticated method for analyzing
the contents of the SYSXPLAIN tables in order to deduce this query
plan structure from the PARENT_RS_ID values. 

One idea is to use some of Java's built-in tree-structured collections
classes, such as a java.util.TreeMap, to retrieve the data from the
SYSXPLAIN tables, then after the data has been retrieved, traverse the
TreeMap in order to emit the XML document with the structure expressed
as XML node containment.

Does this make sense?


> Add tools for improved analysis and understanding of query plans and execution statistics
> -----------------------------------------------------------------------------------------
>
>                 Key: DERBY-4587
>                 URL: https://issues.apache.org/jira/browse/DERBY-4587
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL, Tools
>            Reporter: Bryan Pendleton
>            Assignee: C.S. Nirmal J. Fernando
>         Attachments: Derby Query Plan Screen Shot 2.jpg, DERBY-4587-tool-2.diff, DERBY-4587-tool-3.diff,
DERBY-4587-tool.diff, Derby_Query_Plan_Screen_Shot.jpg, PostgreSQL license.jpg, Read_Me.txt,
Source.rar
>
>
> I think it would be great to see some work in the area of tools for helping
> with the analysis of complex query execution. Quite frequently, users of
> Derby have trouble comprehending (a) how their query is being translated
> into a query plan by the optimizer, and (b) what the execution-time resource
> usage of the various parts of the query is.
> There are low-level features in Derby which capture this information and
> record it, such as logQueryPlan, and the XPLAIN tables, but there is a lot
> of opportunity for designing higher-level tools which can process the query
> plan and execution statistics information and present it in a more
> comprehensible fashion. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message