Return-Path: X-Original-To: apmail-hive-dev-archive@www.apache.org Delivered-To: apmail-hive-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 85CCD17A32 for ; Fri, 31 Oct 2014 21:55:35 +0000 (UTC) Received: (qmail 5703 invoked by uid 500); 31 Oct 2014 21:55:35 -0000 Delivered-To: apmail-hive-dev-archive@hive.apache.org Received: (qmail 5632 invoked by uid 500); 31 Oct 2014 21:55:35 -0000 Mailing-List: contact dev-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@hive.apache.org Delivered-To: mailing list dev@hive.apache.org Received: (qmail 5619 invoked by uid 500); 31 Oct 2014 21:55:35 -0000 Delivered-To: apmail-hadoop-hive-dev@hadoop.apache.org Received: (qmail 5616 invoked by uid 99); 31 Oct 2014 21:55:35 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 31 Oct 2014 21:55:35 +0000 Date: Fri, 31 Oct 2014 21:55:34 +0000 (UTC) From: "Alan Gates (JIRA)" To: hive-dev@hadoop.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (HIVE-8687) Support Avro through HCatalog MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/HIVE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192571#comment-14192571 ] Alan Gates commented on HIVE-8687: ---------------------------------- >From the hcat side these changes look fine to me. I think someone who knows the avro serde better should look at that side though, [~brocknoland] or [~xuefuz] maybe? > Support Avro through HCatalog > ----------------------------- > > Key: HIVE-8687 > URL: https://issues.apache.org/jira/browse/HIVE-8687 > Project: Hive > Issue Type: Bug > Components: HCatalog, Serializers/Deserializers > Affects Versions: 0.14.0 > Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users > Reporter: Sushanth Sowmyan > Assignee: Sushanth Sowmyan > Priority: Critical > Fix For: 0.14.0 > > Attachments: HIVE-8687.branch-0.14.patch > > > Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: > {code} > java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable > at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) > at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) > at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) > at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) > at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) > at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) > at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) > at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) > at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) > {code} > The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. > It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? > The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)