hive-dev mailing list archives

From "Anthony Hsu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
Date Sat, 19 Apr 2014 00:55:14 GMT

    [ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974670#comment-13974670 ]

Anthony Hsu commented on HIVE-6835:
-----------------------------------

[~xuefuz] and [~ashutoshc], just to clarify, is this the alternative solution you're proposing?
# Add
{code}
public void initialize(Configuration configuration, Properties tableProperties, Properties partitionProperties) throws SerDeException;
{code}
to AbstractSerDe and provide a default implementation that just calls {{initialize(configuration, partitionProperties)}}
# Change all calls of {{partitionSerde.initialize(conf, partProps)}} to {{partitionSerde.initialize(conf, tblProps, partProps)}}
# Add
{code}
@Override
public void initialize(Configuration configuration, Properties tableProperties, Properties partitionProperties) throws SerDeException;
{code}
to AvroSerDe and provide an implementation that just uses the tableProperties
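
The three steps above can be sketched as follows. This is only an illustration under stated assumptions, not the actual patch: {{SketchConfiguration}}, {{SketchSerDeException}}, {{SketchSerDe}}, {{PartitionSchemaSerDe}}, and {{SketchAvroSerDe}} are hypothetical stand-ins for Hadoop's Configuration, SerDeException, AbstractSerDe, an unchanged SerDe, and AvroSerDe, so that the example compiles without Hive on the classpath. Reading the schema from {{avro.schema.literal}} is likewise a simplification.

```java
import java.util.Properties;

// Plays the role of a call site after step 2: both property sets are passed.
class InitializeSketch {
    public static void main(String[] args) throws SketchSerDeException {
        Properties tblProps = new Properties();
        tblProps.setProperty("avro.schema.literal", "table-schema");
        Properties partProps = new Properties();
        partProps.setProperty("avro.schema.literal", "partition-schema");

        PartitionSchemaSerDe unchanged = new PartitionSchemaSerDe();
        unchanged.initialize(new SketchConfiguration(), tblProps, partProps);
        System.out.println("unchanged SerDe uses: " + unchanged.schema);

        SketchAvroSerDe avro = new SketchAvroSerDe();
        avro.initialize(new SketchConfiguration(), tblProps, partProps);
        System.out.println("Avro sketch uses: " + avro.schema);
    }
}

// Hypothetical stand-in for org.apache.hadoop.conf.Configuration.
class SketchConfiguration {}

// Hypothetical stand-in for Hive's SerDeException.
class SketchSerDeException extends Exception {
    SketchSerDeException(String message) { super(message); }
}

// Hypothetical stand-in for AbstractSerDe, reduced to the two overloads.
abstract class SketchSerDe {
    // Existing two-argument entry point.
    public abstract void initialize(SketchConfiguration configuration, Properties properties)
            throws SketchSerDeException;

    // Step 1: the new overload defaults to the old behaviour by delegating
    // to the two-argument method with the partition properties.
    public void initialize(SketchConfiguration configuration, Properties tableProperties,
                           Properties partitionProperties) throws SketchSerDeException {
        initialize(configuration, partitionProperties);
    }
}

// A SerDe that does not override the new overload keeps using partition properties.
class PartitionSchemaSerDe extends SketchSerDe {
    String schema;

    @Override
    public void initialize(SketchConfiguration configuration, Properties properties)
            throws SketchSerDeException {
        schema = properties.getProperty("avro.schema.literal");
        if (schema == null) {
            throw new SketchSerDeException("avro.schema.literal is not set");
        }
    }
}

// Step 3: the Avro stand-in overrides the overload and uses only the table properties.
class SketchAvroSerDe extends PartitionSchemaSerDe {
    @Override
    public void initialize(SketchConfiguration configuration, Properties tableProperties,
                           Properties partitionProperties) throws SketchSerDeException {
        initialize(configuration, tableProperties);
    }
}
```

With this shape, SerDes that do not override the three-argument overload behave as before, and only the Avro stand-in switches to the table-level schema.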

I am okay with taking this approach, though it involves many more code changes and changes
the public AbstractSerDe API. Let me know your thoughts on this approach.

> Reading of partitioned Avro data fails if partition schema does not match table schema
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-6835
>                 URL: https://issues.apache.org/jira/browse/HIVE-6835
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.12.0
>            Reporter: Anthony Hsu
>            Assignee: Anthony Hsu
>         Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch
>
>
> To reproduce:
> {code}
> create table testarray (a array<string>);
> load data local inpath '/home/ahsu/test/array.txt' into table testarray;
> # create partitioned Avro table with one array column
> create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} } ] }')  STORED as INPUTFORMAT  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
> insert into table avroarray partition(y=1) select * from testarray;
> # add an int column with a default value of 0
> alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": "record", "fields": [ {"name":"intfield","type":"int","default":0},{ "name":"a", "type":{"type":"array","items":"string"} } ] }');
> # fails with ClassCastException
> select * from avroarray;
> {code}
> The select * fails with:
> {code}
> Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
