drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-6199) Filter push down doesn't work with more than one nested subqueries
Date Fri, 16 Mar 2018 16:06:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402102#comment-16402102 ]

ASF GitHub Bot commented on DRILL-6199:
---------------------------------------

Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1152#discussion_r175136589
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterItemStarReWriterRule.java ---
    @@ -54,83 +44,189 @@
     import static org.apache.drill.exec.planner.logical.FieldsReWriterUtil.FieldsReWriter;
     
     /**
    - * Rule will transform filter -> project -> scan call with item star fields in filter
    - * into project -> filter -> project -> scan where item star fields are pushed into scan
    - * and replaced with actual field references.
    + * Rule will transform item star fields in filter and replaced with actual field references.
      *
      * This will help partition pruning and push down rules to detect fields that can be pruned or push downed.
      * Item star operator appears when sub-select or cte with star are used as source.
      */
    -public class DrillFilterItemStarReWriterRule extends RelOptRule {
    +public class DrillFilterItemStarReWriterRule {
     
    -  public static final DrillFilterItemStarReWriterRule INSTANCE = new DrillFilterItemStarReWriterRule(
    -      RelOptHelper.some(Filter.class, RelOptHelper.some(Project.class, RelOptHelper.any(TableScan.class))),
    -      "DrillFilterItemStarReWriterRule");
    +  public static final DrillFilterItemStarReWriterRule.ProjectOnScan PROJECT_ON_SCAN = new ProjectOnScan(
    +          RelOptHelper.some(DrillProjectRel.class, RelOptHelper.any(DrillScanRel.class)),
    +          "DrillFilterItemStarReWriterRule.ProjectOnScan");
     
    -  private DrillFilterItemStarReWriterRule(RelOptRuleOperand operand, String id) {
    -    super(operand, id);
    -  }
    +  public static final DrillFilterItemStarReWriterRule.FilterOnScan FILTER_ON_SCAN = new FilterOnScan(
    +      RelOptHelper.some(DrillFilterRel.class, RelOptHelper.any(DrillScanRel.class)),
    +      "DrillFilterItemStarReWriterRule.FilterOnScan");
     
    -  @Override
    -  public void onMatch(RelOptRuleCall call) {
    -    Filter filterRel = call.rel(0);
    -    Project projectRel = call.rel(1);
    -    TableScan scanRel = call.rel(2);
    +  public static final DrillFilterItemStarReWriterRule.FilterOnProject FILTER_ON_PROJECT = new FilterOnProject(
    --- End diff --
    
    Done.


> Filter push down doesn't work with more than one nested subqueries
> ------------------------------------------------------------------
>
>                 Key: DRILL-6199
>                 URL: https://issues.apache.org/jira/browse/DRILL-6199
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.13.0
>            Reporter: Anton Gozhiy
>            Assignee: Arina Ielchiieva
>            Priority: Major
>             Fix For: 1.14.0
>
>         Attachments: DRILL_6118_data_source.csv
>
>
> *Data set:*
> The data is generated using the attached file: *DRILL_6118_data_source.csv*
> Data gen commands:
> {code:sql}
> create table dfs.tmp.`DRILL_6118_parquet_partitioned_by_folders/d1` (c1, c2, c3, c4, c5) as select cast(columns[0] as int) c1, columns[1] c2, columns[2] c3, columns[3] c4, columns[4] c5 from dfs.tmp.`DRILL_6118_data_source.csv` where columns[0] in (1, 3);
> create table dfs.tmp.`DRILL_6118_parquet_partitioned_by_folders/d2` (c1, c2, c3, c4, c5) as select cast(columns[0] as int) c1, columns[1] c2, columns[2] c3, columns[3] c4, columns[4] c5 from dfs.tmp.`DRILL_6118_data_source.csv` where columns[0]=2;
> create table dfs.tmp.`DRILL_6118_parquet_partitioned_by_folders/d3` (c1, c2, c3, c4, c5) as select cast(columns[0] as int) c1, columns[1] c2, columns[2] c3, columns[3] c4, columns[4] c5 from dfs.tmp.`DRILL_6118_data_source.csv` where columns[0]>3;
> {code}
> *Steps:*
> # Execute the following query:
> {code:sql}
> explain plan for select * from (select * from (select * from dfs.tmp.`DRILL_6118_parquet_partitioned_by_folders`)) where c1<3
> {code}
> *Expected result:*
> numFiles=2, numRowGroups=2, only files from the folders d1 and d2 should be scanned.
> *Actual result:*
> Filter push down doesn't work:
> numFiles=3, numRowGroups=3, scanning from all files
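
For readers unfamiliar with the "item star" rewrite that the rule's Javadoc above refers to, the idea can be sketched on plain strings. This is a toy illustration only, not the code from the pull request: the class `ItemStarRewriteSketch`, its `Result` type, and the textual condition form `ITEM($0, 'c1')` are assumptions made for the example, and no Drill or Calcite API is used. It replaces each item-star call with the bare field name and collects the fields that the scan would then need to expose, which is roughly what lets pruning and push-down rules see `c1` directly.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy, string-based illustration of the item star rewrite; hypothetical names throughout. */
final class ItemStarRewriteSketch {

  // Assumed textual form of an item star call: ITEM($N, 'field'),
  // where $N stands for the star column coming from the sub-select.
  private static final Pattern ITEM_STAR =
      Pattern.compile("ITEM\\(\\$\\d+,\\s*'([^']+)'\\)");

  /** Rewritten condition plus the field names the scan should produce. */
  static final class Result {
    final String condition;
    final Set<String> scanFields;

    Result(String condition, Set<String> scanFields) {
      this.condition = condition;
      this.scanFields = scanFields;
    }
  }

  /** Replaces every ITEM($N, 'field') with the bare field name and records the field. */
  static Result rewrite(String condition) {
    Set<String> fields = new LinkedHashSet<>();
    Matcher matcher = ITEM_STAR.matcher(condition);
    StringBuffer rewritten = new StringBuffer();
    while (matcher.find()) {
      String field = matcher.group(1);
      fields.add(field);
      // quoteReplacement guards against '$' or '\' inside a field name.
      matcher.appendReplacement(rewritten, Matcher.quoteReplacement(field));
    }
    matcher.appendTail(rewritten);
    return new Result(rewritten.toString(), fields);
  }
}
```

Under these assumptions, `ItemStarRewriteSketch.rewrite("<(ITEM($0, 'c1'), 3)")` yields the condition `<(c1, 3)` with `c1` as the only scan field. The actual rule works on relational expression trees rather than strings, and the bug reported here concerns applying that rewrite through more than one level of nested sub-select.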



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
