hive-dev mailing list archives

From "Sean Busbey (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-4329) HCatalog clients can't write to AvroSerde backed tables
Date Wed, 10 Apr 2013 13:26:16 GMT
Sean Busbey created HIVE-4329:
---------------------------------

             Summary: HCatalog clients can't write to AvroSerde backed tables
                 Key: HIVE-4329
                 URL: https://issues.apache.org/jira/browse/HIVE-4329
             Project: Hive
          Issue Type: Bug
          Components: HCatalog, Serializers/Deserializers
    Affects Versions: 0.10.0
         Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users
            Reporter: Sean Busbey


Attempting to write to an HCatalog-defined table backed by the AvroSerde fails with the following
stack trace:

{code}
java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable
	at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84)
	at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253)
	at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53)
	at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242)
	at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559)
	at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
{code}

The proximate cause of this failure is that AvroContainerOutputFormat's signature mandates
a LongWritable key, while HCat's FileRecordWriterContainer always passes a NullWritable. I'm not
sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable key.

Accepting WritableComparable appears to be what the other Hive OutputFormats do, and there's
no reason AvroContainerOutputFormat couldn't be changed the same way, since it ignores the key.
In that case, could the broader work of making sure FileRecordWriterContainer can always use
NullWritable be spun off into a separate issue?
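
The key-type mismatch can be reproduced without Hadoop on the classpath. The sketch below is
only an illustration: NullWritable, LongWritable, WritableComparable and RecordWriter are
hypothetical stand-ins declared locally, not the real Hadoop or Hive classes, and KeyCastSketch
is an invented name. It assumes the caller reaches the writer through a raw (erased) reference,
which is how a NullWritable can be handed to a writer declared for LongWritable keys at all.

```java
import java.util.ArrayList;
import java.util.List;

class KeyCastSketch {
    // Hypothetical stand-ins for the Hadoop key types; not the real classes.
    interface WritableComparable {}

    static final class NullWritable implements WritableComparable {
        private static final NullWritable INSTANCE = new NullWritable();
        private NullWritable() {}
        static NullWritable get() { return INSTANCE; }
    }

    static final class LongWritable implements WritableComparable {
        final long value;
        LongWritable(long value) { this.value = value; }
    }

    // Stand-in for a record writer interface with a generic key type.
    interface RecordWriter<K, V> {
        void write(K key, V value);
    }

    // Writer declared on LongWritable keys. The key is ignored, but the
    // compiler-generated bridge method still casts it to LongWritable.
    static RecordWriter<LongWritable, String> strictWriter(final List<String> sink) {
        return new RecordWriter<LongWritable, String>() {
            @Override
            public void write(LongWritable key, String value) {
                sink.add(value);
            }
        };
    }

    // Same writer declared on WritableComparable keys: any key type passes.
    static RecordWriter<WritableComparable, String> relaxedWriter(final List<String> sink) {
        return new RecordWriter<WritableComparable, String>() {
            @Override
            public void write(WritableComparable key, String value) {
                sink.add(value);
            }
        };
    }

    // Plays the role of a container that always supplies a NullWritable key
    // through a raw reference, so the mismatch is only caught at run time.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static void writeWithNullKey(RecordWriter writer, String value) {
        writer.write(NullWritable.get(), value);
    }

    // True if the LongWritable-keyed writer rejects the NullWritable key.
    static boolean strictFails() {
        try {
            writeWithNullKey(strictWriter(new ArrayList<String>()), "row");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // True if the WritableComparable-keyed writer stores the value.
    static boolean relaxedSucceeds() {
        List<String> sink = new ArrayList<String>();
        writeWithNullKey(relaxedWriter(sink), "row");
        return sink.size() == 1 && sink.get(0).equals("row");
    }

    public static void main(String[] args) {
        System.out.println("strict writer fails: " + strictFails());
        System.out.println("relaxed writer succeeds: " + relaxedSucceeds());
    }
}
```

Widening only the declared key type is enough to make the cast disappear in this sketch, which
matches the observation that the key is never used by the writer.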

The underlying issue is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter,
so fixing the above will just push the failure into the placeholder RecordWriter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
