drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-4147) Union All operator runs in a single fragment
Date Tue, 02 Aug 2016 00:10:20 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15403078#comment-15403078 ]

ASF GitHub Bot commented on DRILL-4147:
---------------------------------------

Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/drill/pull/555#discussion_r73075658
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/UnionAllPrule.java ---
    @@ -61,7 +61,8 @@ public void onMatch(RelOptRuleCall call) {
             convertedInputList.add(convertedInput);
           }
     
    -      traits = call.getPlanner().emptyTraitSet().plus(Prel.DRILL_PHYSICAL).plus(DrillDistributionTrait.SINGLETON);
    +      // distribution trait is set to ANY to allow Union-All to inherit the distribution of its child
    +      traits = call.getPlanner().emptyTraitSet().plus(Prel.DRILL_PHYSICAL).plus(DrillDistributionTrait.ANY);
    --- End diff --
    
    The union-all's distribution trait will be ANY (I should fix the code comment, since it is
    not accurate). If there is a downstream operator after the union-all, such as an Aggregation
    or a Join, that operator will impose its own distribution requirement. Treating the output
    trait as ANY is at least better than the current SINGLETON, but we are not propagating either
    the LHS or the RHS trait; this could be an enhancement.
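
The reasoning above can be illustrated with a toy model. This is only a sketch under stated assumptions: the `Distribution` enum, `fragmentWidth`, and `needsExchange` are hypothetical names invented for illustration and are not part of Drill or Calcite, and the rules are a simplification of how a planner might treat these traits, not Drill's actual logic.

```java
public class UnionAllTraitSketch {
    /** Hypothetical stand-in for distribution types; not Drill's DrillDistributionTrait. */
    public enum Distribution { SINGLETON, ANY, HASH }

    /**
     * Toy rule (assumption): a SINGLETON trait pins the operator to one fragment,
     * while any other trait lets it run as wide as its inputs.
     */
    public static int fragmentWidth(Distribution unionTrait, int childWidth) {
        return unionTrait == Distribution.SINGLETON ? 1 : childWidth;
    }

    /**
     * Toy rule (assumption): a downstream requirement of ANY accepts whatever is
     * produced; otherwise the produced distribution must equal the required one,
     * or an exchange has to be placed in between.
     */
    public static boolean needsExchange(Distribution produced, Distribution required) {
        return required != Distribution.ANY && produced != required;
    }

    public static void main(String[] args) {
        int childWidth = 8; // assumed number of parallel scan fragments

        System.out.println("SINGLETON width: "
            + fragmentWidth(Distribution.SINGLETON, childWidth));
        System.out.println("ANY width: "
            + fragmentWidth(Distribution.ANY, childWidth));

        // A downstream Aggregation or Join that requires HASH imposes its own
        // requirement on the union-all output, whatever that output trait is.
        System.out.println("ANY -> HASH needs exchange: "
            + needsExchange(Distribution.ANY, Distribution.HASH));
    }
}
```

In this model, switching the union-all from SINGLETON to ANY removes the single-fragment restriction, and a downstream operator with a HASH requirement still gets an exchange because the union-all output does not carry the LHS or RHS distribution.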


> Union All operator runs in a single fragment
> --------------------------------------------
>
>                 Key: DRILL-4147
>                 URL: https://issues.apache.org/jira/browse/DRILL-4147
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: amit hadke
>            Assignee: Aman Sinha
>
> A user noticed that running select  from a single directory is much faster than union all on two directories.
> (https://drill.apache.org/blog/2014/12/09/running-sql-queries-on-amazon-s3/#comment-2349732267)

> It seems like the UNION ALL operator doesn't parallelize sub scans (it's using SINGLETON for distribution type). Everything runs in a single fragment.
> We may have to use SubsetTransformer in UnionAllPrule.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
