hive-dev mailing list archives

From "Brock Noland (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-8654) CBO: parquet_ctas test returns incorrect results
Date Wed, 29 Oct 2014 20:53:33 GMT

    [ https://issues.apache.org/jira/browse/HIVE-8654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188978#comment-14188978 ]

Brock Noland commented on HIVE-8654:
------------------------------------

The only thing I can think of is that the column names written to the file are getting messed
up. Parquet does column resolution by name, so columns whose names don't match the table
schema would be read back as NULL. You can view the contents of a Parquet file with the
parquet dump command.
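
To illustrate why mismatched column names could produce all NULLs under name-based resolution, here is a minimal Python sketch. It is not Hive or Parquet code: the `read_rows` helper and the `_col0`/`_col1` file column names are hypothetical, chosen only to show the effect.

```python
# Toy model of name-based column resolution (an illustration only, not the
# actual Hive/Parquet reader). All names here are hypothetical.

def read_rows(file_columns, rows, table_columns):
    """Look up each table column by name among the columns stored in the file.

    A table column with no same-named column in the file yields None (NULL).
    """
    index = {name: i for i, name in enumerate(file_columns)}
    return [
        tuple(row[index[col]] if col in index else None for col in table_columns)
        for row in rows
    ]


rows = [(0, "val_0"), (2, "val_2")]
table_columns = ["key", "value"]

# File written with the same names as the table schema: values resolve.
good = read_rows(["key", "value"], rows, table_columns)

# File written with different (assumed) names: nothing matches, so every
# column in every row comes back as None.
bad = read_rows(["_col0", "_col1"], rows, table_columns)

print(good)
print(bad)
```

If the file's column names differ from the table's, the data is still in the file but no lookup by name finds it, which would match the "all NULLs" symptom in the report.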

> CBO: parquet_ctas test returns incorrect results
> ------------------------------------------------
>
>                 Key: HIVE-8654
>                 URL: https://issues.apache.org/jira/browse/HIVE-8654
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CBO
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>             Fix For: 0.15.0
>
>
> I am investigating right now. 
> The issue is specific to Parquet:
> {noformat}
> set hive.cbo.enable=true;
> drop table staging;
> drop table parquet_ctas;
> create table staging (key int, value string) stored as textfile;
> insert into table staging select * from src order by key limit 10;
> select * from staging;
> create table parquet_ctas stored as parquet as select * from staging;
> select * from parquet_ctas;
> create table orc_ctas stored as orc as select * from staging;
> select * from orc_ctas;
> create table txt_ctas stored as textfile as select * from staging;
> select * from txt_ctas;
> {noformat}
> The parquet query returns all NULLs with CBO on.



