hive-dev mailing list archives

From "Sushanth Sowmyan (JIRA)" <>
Subject [jira] [Created] (HIVE-8687) Support Avro through HCatalog
Date Fri, 31 Oct 2014 21:24:34 GMT
Sushanth Sowmyan created HIVE-8687:

             Summary: Support Avro through HCatalog
                 Key: HIVE-8687
             Project: Hive
          Issue Type: Bug
          Components: HCatalog, Serializers/Deserializers
    Affects Versions: 0.14.0
         Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive
            Reporter: Sushanth Sowmyan
            Assignee: David Chen
            Priority: Critical
             Fix For: 0.14.0

Attempting to write to an HCatalog-defined table backed by the AvroSerde fails with the following exception:

java.lang.ClassCastException: cannot be cast to
	at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(
	at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(
	at org.apache.hcatalog.pig.HCatBaseStorer.putNext(
	at org.apache.hcatalog.pig.HCatStorer.putNext(
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(
	at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(

The proximal cause of this failure is that AvroContainerOutputFormat's signature mandates
a LongWritable key, while HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure
of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable.

It looks like accepting WritableComparable is what the other Hive OutputFormats already do,
and there's no reason AvroContainerOutputFormat couldn't be changed the same way, since it
ignores the key. The broader fix, making sure FileRecordWriterContainer can always use
NullWritable, could then be spun off into a separate issue?
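
To make the key-type mismatch concrete, here is a minimal, self-contained sketch. It does not use the real Hadoop or Hive classes: the nested WritableComparable, NullWritable, LongWritable and RecordWriter types, and the strictWriter, relaxedWriter and containerWrite methods, are hypothetical stand-ins invented for illustration. It only models a writer whose signature mandates a LongWritable key being handed a NullWritable by a container that holds it as a raw type, and the same writer declared against WritableComparable instead.

```java
final class KeySignatureSketch {
    // Stand-ins for illustration only; these are NOT the Hadoop classes.
    interface WritableComparable {}

    static final class NullWritable implements WritableComparable {
        static final NullWritable INSTANCE = new NullWritable();
        private NullWritable() {}
    }

    static final class LongWritable implements WritableComparable {
        final long value;
        LongWritable(long value) { this.value = value; }
    }

    interface RecordWriter<K, V> {
        void write(K key, V value);
    }

    // Models a format whose signature mandates a LongWritable key,
    // even though the key is never used.
    static RecordWriter<LongWritable, String> strictWriter(final StringBuilder sink) {
        return new RecordWriter<LongWritable, String>() {
            @Override
            public void write(LongWritable key, String value) {
                sink.append(value); // key is ignored
            }
        };
    }

    // Same writer, but declared against the general key type.
    static RecordWriter<WritableComparable, String> relaxedWriter(final StringBuilder sink) {
        return new RecordWriter<WritableComparable, String>() {
            @Override
            public void write(WritableComparable key, String value) {
                sink.append(value); // key is ignored
            }
        };
    }

    // Models a container that holds the writer as a raw type and always
    // passes a NullWritable key.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static void containerWrite(RecordWriter writer, String value) {
        writer.write(NullWritable.INSTANCE, value);
    }

    public static void main(String[] args) {
        StringBuilder sink = new StringBuilder();
        try {
            containerWrite(strictWriter(sink), "row");
            System.out.println("strict writer accepted the key");
        } catch (ClassCastException e) {
            // The compiler-generated bridge method casts the key to LongWritable.
            System.out.println("strict writer rejected the NullWritable key");
        }
        containerWrite(relaxedWriter(sink), "row");
        System.out.println("relaxed writer wrote: " + sink);
    }
}
```

In this model the strict writer fails at the cast before its body runs, while the relaxed writer accepts the NullWritable key. Whether the same change is sufficient in Hive itself depends on the rest of the write path.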

The underlying cause of the failure to write to AvroSerde tables is that AvroContainerOutputFormat
doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure
into the placeholder RecordWriter.

This message was sent by Atlassian JIRA