Subject: Re: Review Request 34006: DRILL-2958: Move Drill to alternative cost-based planner for Join planning
From: "Jinfeng Ni"
To: "Aman Sinha"
Cc: "drill", "Jinfeng Ni"
Date: Sat, 09 May 2015 15:13:49 -0000
Message-ID: <20150509151349.1563.73955@reviews.apache.org>
In-Reply-To: <20150509055746.1563.62320@reviews.apache.org>
References: <20150509055746.1563.62320@reviews.apache.org>
Reply-To: "Jinfeng Ni"
Delivered-To: mailing list dev@drill.apache.org
X-ReviewRequest-URL: https://reviews.apache.org/r/34006/
X-ReviewRequest-Repository: drill-git
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="===============7262373093219877173=="

--===============7262373093219877173==
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit


> On May 8, 2015, 10:57 p.m., Aman Sinha wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java, line 529
> >
> >     Are the criteria for using the LOPT optimizer (a number of tables above a certain threshold) applied internally? We should have a Drill-specific setting for it beyond just a true/false setting.
> 
> Jinfeng Ni wrote:
>     I think it probably makes sense to avoid the LOPT planner for a single-table query. For any query with a JOIN, since the current planner does not enable the SwapJoin rule in the logical planning phase, it may not find the optimal plan, and it relies on a post-planning step to swap the join inputs based on rowcount. In that sense, I feel the LOPT planner might be a better choice even for a 2- or 3-table join.
>     
>     For a single-table query, since there is no join, there seems to be no difference between LOPT and the current planner. That's why I did not add a threshold. Another reason I did not add a threshold option is that we have already added a new option to switch between the existing planner and the new planner. Users can simply turn that option on or off to switch completely to one of them. Having both the switch option and a threshold option would probably cause more confusion, IMHO.


- Jinfeng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34006/#review83134
-----------------------------------------------------------


On May 8, 2015, 5:20 p.m., Jinfeng Ni wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34006/
> -----------------------------------------------------------
> 
> (Updated May 8, 2015, 5:20 p.m.)
> 
> 
> Review request for drill and Aman Sinha.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> Drill currently uses VolcanoPlanner for join planning. This planner has two known issues:
> 
> 1. The search space grows exponentially with the number of tables joined. If a query joins more than 10 tables, planning alone can take minutes, if not longer.
> 
> 2. Drill does not enable a rule to swap the two sides of a join, because of the search space problem. We only swap the join inputs afterwards. See DRILL-2236. This means the join order chosen by Drill's VolcanoPlanner might not be optimal.
> 
> To address these two issues, we are going to provide another planner for join ordering. This planner uses a different set of optimization rules, and its search space does not grow exponentially with the number of tables.
> 
> The main logic of the new planner:
> 1) Let VolcanoPlanner do all the rule transformations as in the current planner's logical planning, except for the join permutation rule.
> 2) After that, pass the plan to HepPlanner with Calcite's LOPT optimization rule to do the join ordering. Feed HepPlanner with Drill's RelMetadataProvider to leverage the statistics (rowcount) available in Drill's tables/files.
> 3) Continue with the same physical planning as before.
> 
> With the limited statistics available in Drill, the new planner seems to produce better query plans than the current one for several TPCH queries.
> 
> Preliminary performance results show that this planner runs faster than the existing one, and the join plan seems to be the same as or better than the plan chosen by the existing planner.
> 
> Will update with a more detailed comparison.
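
To put issue 1 in numbers, here is a minimal sketch that counts the join orders an exhaustive search would have to consider. It is not Drill or Calcite code: the class and method names are hypothetical, and it assumes the standard combinatorial counts (n! left-deep orders, (2n-2)!/(n-1)! bushy trees for n tables). It ignores pruning, memoization, and cross-product restrictions, so it only illustrates how quickly the space grows and is not a measure of the work VolcanoPlanner actually does.

```java
import java.math.BigInteger;

// Hypothetical illustration only; not part of Drill or Calcite.
public class JoinOrderCount {

    // n! computed with BigInteger so large table counts do not overflow.
    public static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    // Left-deep join trees over n tables: one per permutation, i.e. n!.
    public static BigInteger leftDeep(int tables) {
        return factorial(tables);
    }

    // Bushy join trees over n tables, counting left/right input order:
    // n! leaf orderings times Catalan(n-1) tree shapes = (2n-2)! / (n-1)!.
    public static BigInteger bushy(int tables) {
        return factorial(2 * tables - 2).divide(factorial(tables - 1));
    }

    public static void main(String[] args) {
        for (int n : new int[] {2, 3, 5, 10}) {
            System.out.printf("%2d tables: left-deep=%s bushy=%s%n",
                    n, leftDeep(n), bushy(n));
        }
    }
}
```

For 10 tables this gives 3,628,800 left-deep orders and 17,643,225,600 bushy trees, which is consistent with planning time becoming a problem around that size when join permutation rules are enabled.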
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 5ab416c 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillProjectRelBase.java 42ef6ac 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillDefaultRelMetadataProvider.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillRelMdDistinctRowCount.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterRel.java dbd08f4 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRel.java dcccdb0 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillProjectRel.java 6e132aa 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjIntoScan.java 2981de8 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRelFactories.java PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 53e1bff 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java 7d8dd97 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java 3c78c08 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java eda1b5f 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java 4d8b034 
> 
> Diff: https://reviews.apache.org/r/34006/diff/
> 
> 
> Testing
> -------
> 
> Unit test / Regression suite.
> 
> 
> Thanks,
> 
> Jinfeng Ni
> 
>

--===============7262373093219877173==--