drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-4392) CTAS with partition writes an internal field into generated parquet files
Date Fri, 19 Feb 2016 17:56:18 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154553#comment-15154553 ]

ASF GitHub Bot commented on DRILL-4392:
---------------------------------------

Github user jinfengni commented on the pull request:

    https://github.com/apache/drill/pull/383#issuecomment-186332099
  
    Right. The planner cannot remove that internal field through projection removal, because the Writer operator has to use that field. It is the writer's job to exclude that field from the generated files.
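
    The division of labor described above can be illustrated with a minimal sketch. This is not Drill's writer code: the function `write_partitioned` and the list-of-dicts row model are hypothetical, and the sketch assumes that a true value in the internal field marks the first row of a new partition. The writer reads the field to decide where to start a new file, then leaves it out of every record it emits.

```python
# Hypothetical sketch, not Drill's actual Writer operator.
# Assumption: a true flag marks the first row of a new partition.
PARTITION_FLAG = "P_A_R_T_I_T_I_O_N_C_O_M_P_A_R_A_T_O_R"


def write_partitioned(rows, flag=PARTITION_FLAG):
    """Group rows into per-partition "files", dropping the internal flag.

    Each "file" is modeled as a list of dicts. The flag column is consumed
    to find partition boundaries but is never written out.
    """
    files = []
    for row in rows:
        # Start a new output file on a partition boundary (or on the first row).
        if row[flag] or not files:
            files.append([])
        # Emit every column except the internal one.
        files[-1].append({k: v for k, v in row.items() if k != flag})
    return files
```

    With rows shaped like the sample output in the issue (ETHIOPIA flagged true, MOROCCO false, UNITED STATES true), this would yield two files, none of which contains the internal column.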


> CTAS with partition writes an internal field into generated parquet files
> -------------------------------------------------------------------------
>
>                 Key: DRILL-4392
>                 URL: https://issues.apache.org/jira/browse/DRILL-4392
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Jinfeng Ni
>            Assignee: Steven Phillips
>            Priority: Blocker
>
> On today's master branch:
> {code}
> select * from sys.version;
> +-----------------+-------------------------------------------+---------------------------------------------------------------------+----------------------------+-----------------+----------------------------+
> |     version     |                 commit_id                 |                           commit_message                            |        commit_time         |   build_email   |         build_time         |
> +-----------------+-------------------------------------------+---------------------------------------------------------------------+----------------------------+-----------------+----------------------------+
> | 1.5.0-SNAPSHOT  | 9a3a5c4ff670a50a49f61f97dd838da59a12f976  | DRILL-4382: Remove dependency on drill-logical from vector package  | 16.02.2016 @ 11:58:48 PST  | jni@apache.org  | 16.02.2016 @ 17:40:44 PST  |
> +-----------------+-------------------------------------------+---------------------------------------------------------------------+----------------------------+-----------------
> {code}
> A Parquet table created by Drill's CTAS statement contains one internal field, "P_A_R_T_I_T_I_O_N_C_O_M_P_A_R_A_T_O_R". This additional field does not affect non-star queries, but it causes incorrect results for star queries.
> {code}
> use dfs.tmp;
> create table nation_ctas partition by (n_regionkey) as select * from cp.`tpch/nation.parquet`;
> select * from dfs.tmp.nation_ctas limit 6;
> +--------------+----------------+--------------+-----------------------------------------------------------------------------------------------------------------+----------------------------------------+
> | n_nationkey  |     n_name     | n_regionkey  |                                                    n_comment                                                    | P_A_R_T_I_T_I_O_N_C_O_M_P_A_R_A_T_O_R  |
> +--------------+----------------+--------------+-----------------------------------------------------------------------------------------------------------------+----------------------------------------+
> | 5            | ETHIOPIA       | 0            | ven packages wake quickly. regu                                                                                 | true                                   |
> | 15           | MOROCCO        | 0            | rns. blithely bold courts among the closely regular packages use furiously bold platelets?                      | false                                  |
> | 14           | KENYA          | 0            |  pending excuses haggle furiously deposits. pending, express pinto beans wake fluffily past t                   | false                                  |
> | 0            | ALGERIA        | 0            |  haggle. carefully final deposits detect slyly agai                                                             | false                                  |
> | 16           | MOZAMBIQUE     | 0            | s. ironic, unusual asymptotes wake blithely r                                                                   | false                                  |
> | 24           | UNITED STATES  | 1            | y final packages. slow foxes cajole quickly. quickly silent platelets breach ironic accounts. unusual pinto be  | true
> {code}
> This basically breaks all the parquet files created by Drill's CTAS with partition support.

> It will also fail one of the pre-commit functional tests [1].
> [1] https://github.com/mapr/drill-test-framework/blob/master/framework/resources/Functional/ctas/ctas_auto_partition/general/data/drill3361.q



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
