drill-dev mailing list archives

From "Jinfeng Ni" <...@maprtech.com>
Subject Review Request 34006: DRILL-2958: Move Drill to alternative cost-based planner for Join planning
Date Sat, 09 May 2015 00:11:34 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34006/
-----------------------------------------------------------

Review request for drill and Aman Sinha.


Repository: drill-git


Description
-------

Drill currently uses VolcanoPlanner for join planning. This planner has two known issues:

1. The search space grows exponentially with the number of tables joined. If a query joins
more than 10 tables, the planning time alone could be minutes, if not longer.

2. Because of the search space problem, Drill does not enable a rule that swaps the two sides
of a join. We only swap the join inputs afterwards; see DRILL-2236. This means the join order
chosen by Drill's VolcanoPlanner might not be optimal.
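
To give a rough sense of the growth described in issue 1, the sketch below counts the possible
join orders for n tables using the standard textbook formulas: n! left-deep orders, and
(2n-2)!/(n-1)! bushy trees when the two sides of each join are treated as distinct. This is
only an illustration. The class and method names are hypothetical, and the counts are an upper
bound on the logical join orders, not a measurement of what VolcanoPlanner actually explores.

```java
import java.math.BigInteger;

// Hypothetical helper, not part of Drill or Calcite.
class JoinSearchSpace {

    // k! computed with BigInteger to avoid overflow for larger k.
    static BigInteger factorial(int k) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= k; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    // Number of left-deep join orders for n tables: n!
    static BigInteger leftDeepOrders(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        return factorial(n);
    }

    // Number of bushy join trees for n tables, counting A JOIN B and
    // B JOIN A as different trees: (2n-2)! / (n-1)!
    static BigInteger bushyOrders(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        return factorial(2 * n - 2).divide(factorial(n - 1));
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 12; n++) {
            System.out.println(n + " tables: left-deep=" + leftDeepOrders(n)
                    + ", bushy=" + bushyOrders(n));
        }
    }
}
```

Under these formulas, 10 tables already give 3,628,800 left-deep orders and 17,643,225,600
bushy trees, which is consistent with the planning times mentioned above.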

To address these two issues, we are going to provide another planner for join ordering.
This planner will use a different set of optimization rules, and its search space does not
grow exponentially with the number of tables.

The main logic of this new planner:
1) Let VolcanoPlanner apply all the rule transformations used in the current planner's logical
planning, except for the join permutation rule.
2) After that, pass the plan to HepPlanner with Calcite's LOPT optimization rule and let it do
the join ordering. Feed the HepPlanner with Drill's RelMetadataProvider, to leverage the
statistics (rowcount) available in Drill's tables/files.
3) Continue with the same physical planning as before.
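
As a rough illustration of why a heuristic, rowcount-driven join ordering avoids the exponential
search in step 2, here is a toy greedy ordering in plain Java. It is NOT Calcite's LOPT
algorithm and uses no Drill or Calcite API. The class name, the single fixed join selectivity,
and the table row counts are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy example, not part of Drill or Calcite.
class GreedyJoinOrder {

    // Pick a left-deep order by joining the smallest tables first.
    // This is one sort, O(n log n), instead of enumerating all orders.
    static List<String> order(Map<String, Long> rowCounts) {
        List<String> names = new ArrayList<>(rowCounts.keySet());
        names.sort(Comparator
                .comparingLong((String name) -> rowCounts.get(name))
                .thenComparing(Comparator.<String>naturalOrder()));
        return names;
    }

    // Estimated cost of a left-deep order: the sum of intermediate result
    // sizes, assuming every join has the same (made-up) selectivity.
    static double leftDeepCost(List<String> order, Map<String, Long> rowCounts,
                               double selectivity) {
        double rows = rowCounts.get(order.get(0));
        double cost = 0.0;
        for (int i = 1; i < order.size(); i++) {
            rows = rows * rowCounts.get(order.get(i)) * selectivity;
            cost += rows;
        }
        return cost;
    }

    public static void main(String[] args) {
        // Illustrative row counts only.
        Map<String, Long> rowCounts = new LinkedHashMap<>();
        rowCounts.put("lineitem", 6000000L);
        rowCounts.put("orders", 1500000L);
        rowCounts.put("customer", 150000L);
        rowCounts.put("nation", 25L);

        List<String> greedy = order(rowCounts);
        System.out.println("greedy order: " + greedy);
        System.out.println("estimated cost: " + leftDeepCost(greedy, rowCounts, 1e-6));
    }
}
```

The point of the sketch is only that the ordering decision is driven by the row counts and
takes polynomial time. The real rule also considers join conditions and other factors.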

With the limited statistics available in Drill, the new planner seems to produce better query
plans than the current one for several TPC-H queries.

Preliminary performance results show that this planner runs faster than the existing one, and
the join plan seems to be the same as or better than the plan chosen by the existing planner.

Will update with a more detailed comparison.


Diffs
-----

  exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 5ab416c
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillProjectRelBase.java 42ef6ac
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillDefaultRelMetadataProvider.java PRE-CREATION
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillRelMdDistinctRowCount.java PRE-CREATION
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterRel.java dbd08f4
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRel.java dcccdb0
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillProjectRel.java 6e132aa
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjIntoScan.java 2981de8
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRelFactories.java PRE-CREATION
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 53e1bff
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java 7d8dd97
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java 3c78c08
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java eda1b5f
  exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java 4d8b034

Diff: https://reviews.apache.org/r/34006/diff/


Testing
-------

Unit test / Regression suite.


Thanks,

Jinfeng Ni

