drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-4147) Union All operator runs in a single fragment
Date Mon, 01 Aug 2016 18:31:20 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402584#comment-15402584 ]

ASF GitHub Bot commented on DRILL-4147:
---------------------------------------

Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/555#discussion_r73028431
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/UnionAllPrule.java ---
    @@ -61,7 +61,8 @@ public void onMatch(RelOptRuleCall call) {
             convertedInputList.add(convertedInput);
           }
     
    -      traits = call.getPlanner().emptyTraitSet().plus(Prel.DRILL_PHYSICAL).plus(DrillDistributionTrait.SINGLETON);
    +      // distribution trait is set to ANY to allow Union-All to inherit the distribution of its child
    +      traits = call.getPlanner().emptyTraitSet().plus(Prel.DRILL_PHYSICAL).plus(DrillDistributionTrait.ANY);
    --- End diff --
    
    Will setting "ANY" work when the inputs of union-all have different distribution traits?
    
    Say the LHS has Hash-distribution(col1) and the RHS has Hash-distribution(col1, col2) or Random distribution. What will union-all's distribution trait be? Will it use the LHS's or the RHS's?
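
    To make the question concrete, here is a minimal sketch of one conservative policy: keep the children's distribution only when every input agrees, and otherwise fall back to a random distribution. The `Kind` and `Dist` types and the `deriveUnionDistribution` method are hypothetical, invented for illustration; they are not Drill's `DrillDistributionTrait` API, and the fallback choice is an assumption rather than what the planner does.

```java
import java.util.List;

// Hypothetical, simplified model of distribution traits (not Drill's classes).
public class UnionDistributionSketch {

    enum Kind { SINGLETON, HASH, RANDOM, ANY }

    // A distribution is a kind plus, for HASH, the ordered list of hash columns.
    record Dist(Kind kind, List<String> columns) {
        static Dist of(Kind kind, String... columns) {
            return new Dist(kind, List.of(columns));
        }
    }

    // Returns the distribution shared by all inputs. When the inputs disagree,
    // returns RANDOM (an assumed fallback policy, chosen for illustration).
    static Dist deriveUnionDistribution(List<Dist> inputs) {
        if (inputs.isEmpty()) {
            throw new IllegalArgumentException("union-all needs at least one input");
        }
        Dist first = inputs.get(0);
        for (Dist input : inputs) {
            if (!input.equals(first)) {
                return Dist.of(Kind.RANDOM);
            }
        }
        return first;
    }

    public static void main(String[] args) {
        Dist lhs = Dist.of(Kind.HASH, "col1");
        Dist rhs = Dist.of(Kind.HASH, "col1", "col2");
        // Both inputs hashed on col1: the union can keep that distribution.
        System.out.println(deriveUnionDistribution(List.of(lhs, lhs)));
        // Inputs hashed on different keys: neither side's trait holds for the union.
        System.out.println(deriveUnionDistribution(List.of(lhs, rhs)));
    }
}
```

    Under this policy the LHS/RHS case above yields neither side's trait, since rows from the two inputs are not co-located under a single hash key.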



> Union All operator runs in a single fragment
> --------------------------------------------
>
>                 Key: DRILL-4147
>                 URL: https://issues.apache.org/jira/browse/DRILL-4147
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: amit hadke
>            Assignee: Aman Sinha
>
> A user noticed that running a select from a single directory is much faster than a union all on two directories.
> (https://drill.apache.org/blog/2014/12/09/running-sql-queries-on-amazon-s3/#comment-2349732267)

> It seems the UNION ALL operator doesn't parallelize sub scans (it's using SINGLETON for the distribution type). Everything is run in a single fragment.
> We may have to use SubsetTransformer in UnionAllPrule.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
