drill-issues mailing list archives

From "Robert Hou (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5093) Explain plan shows all partitions when query scans all partitions, and filter pushdown is used with metadata caching.
Date Thu, 01 Dec 2016 23:58:58 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15713470#comment-15713470 ]

Robert Hou commented on DRILL-5093:
-----------------------------------

I don't think this is a show stopper.

> Explain plan shows all partitions when query scans all partitions, and filter pushdown is used with metadata caching.
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-5093
>                 URL: https://issues.apache.org/jira/browse/DRILL-5093
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.9.0
>            Reporter: Robert Hou
>            Assignee: Jinfeng Ni
>         Attachments: 0_0_1.parquet, 0_0_2.parquet, 0_0_3.parquet, 0_0_4.parquet, 0_0_5.parquet,
drill.parquet_metadata
>
>
> This query scans all the partitions because none of them can be pruned. When metadata
caching is used, the explain plan lists every partition, when it should show only the parent directory.
> 0: jdbc:drill:zk=10.10.100.186:5181/drill/rho> explain plan for select * from orders_parts_metadata;
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(*=[$0])
> 00-02        Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/drill/testdata/filter/orders_parts_metadata/0_0_1.parquet],
ReadEntryWithPath [path=/drill/testdata/filter/orders_parts_metadata/0_0_3.parquet], ReadEntryWithPath
[path=/drill/testdata/filter/orders_parts_metadata/0_0_4.parquet], ReadEntryWithPath [path=/drill/testdata/filter/orders_parts_metadata/0_0_5.parquet],
ReadEntryWithPath [path=/drill/testdata/filter/orders_parts_metadata/0_0_2.parquet]], selectionRoot=/drill/testdata/filter/orders_parts_metadata,
numFiles=5, usedMetadataFile=true, cacheFileRoot=/drill/testdata/filter/orders_parts_metadata,
columns=[`*`]]])
> To reproduce the problem, put the attached files into a directory. Then create the metadata:
> refresh table metadata dfs.`path_to_directory`;
> For example, if you put the files in /drill/testdata/filter/orders_parts_metadata, then
run this SQL command:
> refresh table metadata dfs.`/drill/testdata/filter/orders_parts_metadata`;
> Here is the same query against a table that does not use metadata caching.
> 0: jdbc:drill:zk=10.10.100.186:5181/drill/rho> explain plan for select * from orders_parts;
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(*=[$0])
> 00-02        Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/filter/orders_parts]],
selectionRoot=maprfs:/drill/testdata/filter/orders_parts, numFiles=1, usedMetadataFile=false,
columns=[`*`]]])
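
The difference between the two plans can be checked mechanically. The following is a minimal sketch, not part of Drill: it assumes the Scan lines have been copied out of the plan text exactly as shown above, and the helper name `summarize_scan` is hypothetical. It counts the `ReadEntryWithPath` entries and reads the `numFiles` and `usedMetadataFile` fields with regular expressions.

```python
import re

# Scan lines copied from the two explain plans above (email line wrapping kept).
CACHED_PLAN = """\
00-02        Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/drill/testdata/filter/orders_parts_metadata/0_0_1.parquet],
ReadEntryWithPath [path=/drill/testdata/filter/orders_parts_metadata/0_0_3.parquet], ReadEntryWithPath
[path=/drill/testdata/filter/orders_parts_metadata/0_0_4.parquet], ReadEntryWithPath [path=/drill/testdata/filter/orders_parts_metadata/0_0_5.parquet],
ReadEntryWithPath [path=/drill/testdata/filter/orders_parts_metadata/0_0_2.parquet]], selectionRoot=/drill/testdata/filter/orders_parts_metadata,
numFiles=5, usedMetadataFile=true, cacheFileRoot=/drill/testdata/filter/orders_parts_metadata,
columns=[`*`]]])
"""

UNCACHED_PLAN = """\
00-02        Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/filter/orders_parts]],
selectionRoot=maprfs:/drill/testdata/filter/orders_parts, numFiles=1, usedMetadataFile=false,
columns=[`*`]]])
"""


def summarize_scan(plan_text):
    """Summarize a ParquetGroupScan line (hypothetical helper, not a Drill API)."""
    # \s+ tolerates a line break between "ReadEntryWithPath" and "[path=...".
    paths = re.findall(r"ReadEntryWithPath\s+\[path=([^\]]+)\]", plan_text)
    num_files = re.search(r"numFiles=(\d+)", plan_text)
    used_cache = re.search(r"usedMetadataFile=(true|false)", plan_text)
    return {
        "entries": len(paths),
        "paths": paths,
        "num_files": int(num_files.group(1)) if num_files else None,
        "used_metadata_file": (used_cache.group(1) == "true") if used_cache else None,
    }


if __name__ == "__main__":
    for name, plan in (("cached", CACHED_PLAN), ("uncached", UNCACHED_PLAN)):
        summary = summarize_scan(plan)
        print(name, summary["entries"], summary["num_files"], summary["used_metadata_file"])
```

With the plans above, the cached table reports one entry per parquet file, while the uncached table reports a single entry for the parent directory, which is the discrepancy this issue describes.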



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
