drill-issues mailing list archives

From "Aman Sinha (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-3948) Partitioning columns of a Parquet table should be made visible to end user
Date Sun, 18 Oct 2015 20:05:05 GMT
Aman Sinha created DRILL-3948:
---------------------------------

             Summary: Partitioning columns of a Parquet table should be made visible to end user
                 Key: DRILL-3948
                 URL: https://issues.apache.org/jira/browse/DRILL-3948
             Project: Apache Drill
          Issue Type: Improvement
          Components: Metadata, Query Planning & Optimization
    Affects Versions: 1.2.0
            Reporter: Aman Sinha


For Parquet files, Drill can do partition pruning for filter conditions on a column that
satisfies the following criterion:
  Each Parquet file has a single value of that column. The Parquet metadata is examined for
the min and max values of that column, and if they are the same, the column is considered a
partitioning column.

  When CTAS auto-partition is used, the above criterion is enforced, but files created
through external methods could also satisfy it.
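
The min/max check described above can be sketched as follows. This is only an
illustration in Python, not Drill's actual implementation: the function name
candidate_partition_columns and the shape of the per-file statistics dictionary are
assumptions made for the example, and it treats a column as a candidate only when
min equals max in every file of the table.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical per-file statistics: file name -> {column name -> (min, max)}.
# In practice these values would come from the Parquet footer metadata.
FileStats = Dict[str, Dict[str, Tuple[Any, Any]]]


def candidate_partition_columns(stats: FileStats) -> List[str]:
    """Return columns whose min equals max in every file (illustrative only)."""
    if not stats:
        return []
    # Only columns that have statistics in every file can qualify.
    common = set.intersection(*(set(cols) for cols in stats.values()))
    result = []
    for col in sorted(common):
        # A file holds a single value of the column when its min equals its max.
        if all(cols[col][0] == cols[col][1] for cols in stats.values()):
            result.append(col)
    return result


example = {
    "part-0.parquet": {"year": (2014, 2014), "amount": (5, 90)},
    "part-1.parquet": {"year": (2015, 2015), "amount": (1, 70)},
}
print(candidate_partition_columns(example))  # prints ['year']
```

Here "year" qualifies because each file contains a single year, while "amount" does
not because its min and max differ within a file.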

It is difficult for users to know exactly which columns are candidate partitioning columns
in the table.  We should provide this information in a user-friendly way, for instance:
  - a special 'show partition columns for <table>' command
  - in the Explain plan, show partition columns for the table in the Scan node
More options should be discussed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
