drill-dev mailing list archives

From arina-ielchiieva <...@git.apache.org>
Subject [GitHub] drill pull request #794: DRILL-5375: Nested loop join: return correct result...
Date Sat, 25 Mar 2017 13:36:56 GMT
Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/794#discussion_r108035986
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java ---
    @@ -105,6 +103,29 @@
       public static final PositiveLongValidator PARQUET_ROWGROUP_FILTER_PUSHDOWN_PLANNING_THRESHOLD = new PositiveLongValidator(PARQUET_ROWGROUP_FILTER_PUSHDOWN_PLANNING_THRESHOLD_KEY,
           Long.MAX_VALUE, 10000);
     
    +  /*
    +     Enables rules that re-write query joins in the most optimal way.
    +     Though it is turned on by default and its value in query optimization is undeniable, the user may want to turn off such
    +     optimization to leave the join order indicated in the SQL query unchanged.
    +
    +      For example:
    +      Currently only nested loop join allows non-equi join conditions.
    +      During the planning stage, nested loop join will be chosen when a non-equi join is detected
    +      and {@link #NLJOIN_FOR_SCALAR} is set to false. Though query performance may not be optimal in such a case,
    +      the user may use this workaround to execute queries with non-equi joins.
    +
    +      Nested loop join allows only INNER and LEFT joins and implies that the right input is smaller than the left input.
    +      During a LEFT join, when join optimization is enabled and the right input is detected to be larger than the left,
    +      the join will be optimized: left and right inputs will be flipped and the LEFT join type will be changed to RIGHT.
    +      If the query contains non-equi joins, it will fail after such optimization, since nested loop join does not allow
    +      RIGHT joins. In this case, if the user accepts the possibility of non-optimal performance, the user may turn off join optimization.
    +      Turning off join optimization makes sense only if the user is not sure that the right input is smaller than or equal to the left;
    +      otherwise join optimization can be left turned on.
    +
    +      Note: once hash and merge joins allow non-equi join conditions,
    +      the need to turn off join optimization may go away.
    +   */
    +  public static final BooleanValidator JOIN_OPTIMIZATION = new BooleanValidator("planner.enable_join_optimization", true);
    --- End diff --
    
    JOIN_OPTIMIZATION enables two rules, `DRILL_JOIN_TO_MULTIJOIN_RULE` and `DRILL_LOPT_OPTIMIZE_JOIN_RULE`,
    which are applicable to any type of join. That's why the naming is quite broad: I believe these
    two rules are responsible not only for join swap but for all other join optimization techniques.
    In our use case, the user may want to disable them when he doesn't want join swap to be performed,
    but there may be other reasons. Though, as I have noted, when we implement non-equality joins
    for hash and merge joins, we may remove this configuration parameter.
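
To make the failure scenario concrete, here is a minimal toy model of the behaviour described in the Javadoc above. It is only a sketch, not Drill's planner code: the class `JoinSwapSketch`, its methods, and the row-count parameters are hypothetical names introduced for illustration, and the swap condition is simplified to "right input has more rows than left".

```java
import java.util.EnumSet;
import java.util.Set;

/** Toy model of the join-swap scenario; hypothetical, not part of Drill. */
class JoinSwapSketch {

  enum JoinType { INNER, LEFT, RIGHT }

  // Per the Javadoc in the diff: nested loop join allows only INNER and LEFT joins.
  static final Set<JoinType> NLJ_SUPPORTED = EnumSet.of(JoinType.INNER, JoinType.LEFT);

  /**
   * Hypothetical stand-in for the join-swap effect of the optimization rules:
   * when optimization is enabled and the right input is larger than the left,
   * the inputs are flipped and a LEFT join becomes a RIGHT join.
   */
  static JoinType plannedType(JoinType type, long leftRows, long rightRows,
                              boolean joinOptimization) {
    if (joinOptimization && type == JoinType.LEFT && rightRows > leftRows) {
      return JoinType.RIGHT;
    }
    return type;
  }

  /**
   * A non-equi join can only run as a nested loop join, so the query is
   * executable only if the planned join type is one that nested loop supports.
   */
  static boolean nonEquiJoinCanRun(JoinType type, long leftRows, long rightRows,
                                   boolean joinOptimization) {
    return NLJ_SUPPORTED.contains(plannedType(type, leftRows, rightRows, joinOptimization));
  }

  public static void main(String[] args) {
    // Right input larger, optimization on: LEFT is swapped to RIGHT, query fails.
    System.out.println(nonEquiJoinCanRun(JoinType.LEFT, 10, 1_000, true));
    // Same inputs, optimization off: join stays LEFT, query can run.
    System.out.println(nonEquiJoinCanRun(JoinType.LEFT, 10, 1_000, false));
    // Right input smaller: no swap happens, so optimization can stay on.
    System.out.println(nonEquiJoinCanRun(JoinType.LEFT, 1_000, 10, true));
  }
}
```

In this model the only case that fails is a LEFT non-equi join whose right input is larger while optimization is enabled, which is the case where setting `planner.enable_join_optimization` to false would help.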


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
