hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Xuefu Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
Date Mon, 21 Apr 2014 18:22:20 GMT

    [ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975824#comment-13975824
] 

Xuefu Zhang commented on HIVE-6835:
-----------------------------------

I think that's pretty much what you need to do. While #2 may touch many files, it's fairly
safe as #1 guarantees that the same code will be exercised. There isn't much API change. You
add one with default implementation and deprecate the old one. In #3, you have both property
sets and do whatever you need for Avro.

> Reading of partitioned Avro data fails if partition schema does not match table schema
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-6835
>                 URL: https://issues.apache.org/jira/browse/HIVE-6835
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.12.0
>            Reporter: Anthony Hsu
>            Assignee: Anthony Hsu
>         Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch
>
>
> To reproduce:
> {code}
> create table testarray (a array<string>);
> load data local inpath '/home/ahsu/test/array.txt' into table testarray;
> # create partitioned Avro table with one array column
> create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":
"record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }')  STORED
as INPUTFORMAT  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
> insert into table avroarray partition(y=1) select * from testarray;
> # add an int column with a default value of 0
> alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":
"record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"}
} ] }');
> # fails with ClassCastException
> select * from avroarray;
> {code}
> The select * fails with:
> {code}
> Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector
cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Mime
View raw message