hive-dev mailing list archives

From "Xuefu Zhang (JIRA)"
Subject [jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
Date Fri, 18 Apr 2014 02:41:15 GMT


Xuefu Zhang commented on HIVE-6835:

[~erwaman] Thanks for the explanation. Now I see where the problem is. SERDEPROPERTIES and
TBLPROPERTIES serve different purposes. I'm curious why a user would put avro.schema.literal
in the serde properties, since the schema is table-specific and belongs in TBLPROPERTIES.
SERDEPROPERTIES, on the other hand, controls serde behavior (at the plugin level rather than
the table level), such as the field delimiter, which doesn't necessarily vary from table to table.
If you check the AvroSerDe documentation, the schema is specified in TBLPROPERTIES.
Thus, it seems that this fix addresses an invalid use case. What are your thoughts on this?
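
For illustration only, here is a minimal sketch of the TBLPROPERTIES placement described above, reusing the table name and schemas from the reproduction steps. The INPUTFORMAT and OUTPUTFORMAT class names come from the AvroSerDe documentation, not from this thread, and whether this placement avoids the reported ClassCastException has not been verified here.

{code}
-- Sketch: same table as in the reproduction, with the schema in TBLPROPERTIES.
-- The INPUTFORMAT/OUTPUTFORMAT classes are an assumption taken from the AvroSerDe docs.
CREATE TABLE avroarray
PARTITIONED BY (y string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":"record","fields":[{"name":"a","type":{"type":"array","items":"string"}}]}');

-- Sketch: evolve the schema by updating the table property rather than the serde properties.
ALTER TABLE avroarray SET TBLPROPERTIES ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":"record","fields":[{"name":"intfield","type":"int","default":0},{"name":"a","type":{"type":"array","items":"string"}}]}');
{code}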

> Reading of partitioned Avro data fails if partition schema does not match table schema
> --------------------------------------------------------------------------------------
>                 Key: HIVE-6835
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.12.0
>            Reporter: Anthony Hsu
>            Assignee: Anthony Hsu
>         Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch
> To reproduce:
> {code}
> create table testarray (a array<string>);
> load data local inpath '/home/ahsu/test/array.txt' into table testarray;
> -- create partitioned Avro table with one array column
> create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }')  STORED
> insert into table avroarray partition(y=1) select * from testarray;
> -- add an int column with a default value of 0
> alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }');
> -- fails with ClassCastException
> select * from avroarray;
> {code}
> The select * fails with:
> {code}
> Failed with exception org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
> {code}
