drill-issues mailing list archives

From "Adam Gilmore (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-3333) Add support for auto-partitioning in parquet writer
Date Wed, 24 Jun 2015 09:36:05 GMT

    [ https://issues.apache.org/jira/browse/DRILL-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599175#comment-14599175 ]

Adam Gilmore commented on DRILL-3333:
-------------------------------------

I'm curious - is this not redundant with DRILL-1950?  DRILL-1950 adds more comprehensive filter
pushdown for Parquet, so it will prune not only on single values, as the above patch does,
but on all kinds of ranges, etc.

> Add support for auto-partitioning in parquet writer
> ---------------------------------------------------
>
>                 Key: DRILL-3333
>                 URL: https://issues.apache.org/jira/browse/DRILL-3333
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Steven Phillips
>            Assignee: Steven Phillips
>         Attachments: DRILL-3333.patch, DRILL-3333.patch, DRILL-3333_2015-06-22_15:22:11.patch, DRILL-3333_2015-06-23_17:38:32.patch
>
>
> When a table is created with a PARTITION BY clause, the Parquet writer creates separate files for the different partition values. The data is first sorted by the partition keys, and the writer starts a new file whenever it encounters a new value for the partition columns.
> When a table created this way is queried, partition pruning works if the filter contains a partition column. Unlike directory-based partitioning, no view is required, nor is it necessary to reference the dir* column names.
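
The behaviour described in the issue (sort by the partition keys, start a new file on each new key value, then skip files whose value does not match the filter) can be sketched roughly as follows. This is only an illustration in Python, not the code in the attached patch: the names write_partitioned and prune are hypothetical, rows are plain dicts, and in-memory lists stand in for Parquet files.

```python
from itertools import groupby
from operator import itemgetter


def write_partitioned(rows, partition_keys):
    """Hypothetical stand-in for the writer: one 'file' per distinct key value.

    Returns a list of (partition_values, rows_in_file) pairs.
    """
    key_of = itemgetter(*partition_keys)
    # Sort first so that equal partition values are adjacent.
    ordered = sorted(rows, key=key_of)
    files = []
    # groupby starts a new group (a new "file") whenever the key changes.
    for values, group in groupby(ordered, key=key_of):
        if len(partition_keys) == 1:
            # itemgetter returns a scalar for a single key; normalise to a tuple.
            values = (values,)
        files.append((values, list(group)))
    return files


def prune(files, partition_keys, column, wanted):
    """Keep only the files whose single partition value equals the filter value."""
    idx = partition_keys.index(column)
    return [f for f in files if f[0][idx] == wanted]
```

Because each file holds exactly one value per partition column, an equality filter on that column can discard whole files without reading them. Range filters, as raised in the comment above, would need a comparison instead of the equality test used here.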



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
