hive-issues mailing list archives

From "Xuefu Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-11827) STORED AS AVRO fails SELECT COUNT(*) when empty
Date Fri, 18 Sep 2015 16:13:04 GMT

    [ https://issues.apache.org/jira/browse/HIVE-11827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14875856#comment-14875856 ]

Xuefu Zhang commented on HIVE-11827:
------------------------------------

My understanding is that for Avro, the schema must be provided via either a literal or a URL. In
other words, the schema for Avro tables comes from an external source. If we are adding functionality,
then we have to ensure that the new feature works in more general cases than just an empty
table.
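
To make the literal-or-URL requirement concrete, here is a minimal Python sketch of the
lookup that the error message in the quoted report implies. It is a simplified model and
not Hive's actual code: the function name determine_schema_source, the exception class
SchemaNotFound, and the example URL are hypothetical, and only the two property names and
the error text come from the report below.

```python
import json

# Property names taken from the error message in the quoted report.
SCHEMA_LITERAL = "avro.schema.literal"
SCHEMA_URL = "avro.schema.url"


class SchemaNotFound(Exception):
    """Hypothetical stand-in for Hive's AvroSerdeException."""


def determine_schema_source(table_props):
    """Return ("literal", parsed_schema) or ("url", url_string).

    Simplified model: the literal is checked first, then the URL.
    Raises SchemaNotFound when neither property is set, which is the
    situation the quoted stack trace reports.
    """
    literal = table_props.get(SCHEMA_LITERAL)
    if literal:
        return ("literal", json.loads(literal))
    url = table_props.get(SCHEMA_URL)
    if url:
        return ("url", url)
    raise SchemaNotFound(
        "Neither avro.schema.literal nor avro.schema.url specified, "
        "can't determine table schema"
    )


# Assumption: a table declared only as "(a int) stored as avro" reaches
# this lookup with neither property set, as the error message suggests.
implicit_props = {}

# The same single column expressed as an explicit Avro schema literal.
explicit_props = {
    SCHEMA_LITERAL: json.dumps({
        "type": "record",
        "name": "j2",
        "fields": [{"name": "a", "type": ["null", "int"], "default": None}],
    })
}
```

Calling determine_schema_source(explicit_props) returns the parsed schema, while
determine_schema_source(implicit_props) raises with the same message seen in the report.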

> STORED AS AVRO fails SELECT COUNT(*) when empty
> -----------------------------------------------
>
>                 Key: HIVE-11827
>                 URL: https://issues.apache.org/jira/browse/HIVE-11827
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>         Environment: CDH5.4.5
>            Reporter: Johndee Burks
>            Assignee: Yongzhi Chen
>            Priority: Minor
>         Attachments: HIVE-11827.1.patch
>
>
> If you create a table stored as avro and try to do select count(*) against the table, it will fail. The following shows this. An empty table in this situation is a table with no files.

> {code}
> hive> create table j2 (a int) stored as avro;
> OK
> Time taken: 1.069 seconds
> hive> select count(*) from j2;
> Query ID = johndee_20150915113434_d4fe99d4-7fb9-42fe-9b91-ad560eeacc48
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> java.io.IOException: org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema
> 	at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat.getHiveRecordWriter(AvroContainerOutputFormat.java:65)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.createEmptyFile(Utilities.java:3430)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.createDummyFileForEmptyPartition(Utilities.java:3463)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.getInputPaths(Utilities.java:3387)
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:370)
> 	at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
> 	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> 	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
> 	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1640)
> 	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1399)
> 	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1183)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
> 	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
> 	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
> 	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
> 	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema
> 	at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:109)
> 	at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat.getHiveRecordWriter(AvroContainerOutputFormat.java:63)
> 	... 24 more
> Job Submission failed with exception 'java.io.IOException(org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema)'
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
