hive-dev mailing list archives

From "Navis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-7217) Inner join query fails in the reducer when join key file is spilled to tmp by RowContainer
Date Fri, 13 Jun 2014 03:17:02 GMT

    [ https://issues.apache.org/jira/browse/HIVE-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030218#comment-14030218 ]

Navis commented on HIVE-7217:
-----------------------------

If you build Hive from source with -Phadoop-1, this does not happen. I think this was fixed
in HIVE-6037, but that fix failed to get into trunk. I will make a separate patch for this.

> Inner join query fails in the reducer when join key file is spilled to tmp by RowContainer
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-7217
>                 URL: https://issues.apache.org/jira/browse/HIVE-7217
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.13.0, 0.13.1
>            Reporter: Muthu
>         Attachments: reducer.log
>
>
> {code}
> SELECT T1.userid, T2.video_title FROM videoview T1 JOIN video T2 ON T1.video_id = T2.video_id WHERE T1.hourid=389567
> hive> show create table video;
> OK
> CREATE  TABLE `video`(
>   `video_id` int,
>   `video_title` string,
> )
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\t'
>   LINES TERMINATED BY '\n'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'hdfs://elsharpynn001.prod.hulu.com:8020/hive/warehouse/video'
> TBLPROPERTIES (
>   'numPartitions'='0',
>   'numFiles'='1',
>   'last_modified_by'='hadoop',
>   'last_modified_time'='1336446601',
>   'COLUMN_STATS_ACCURATE'='true',
>   'transient_lastDdlTime'='1402514051',
>   'numRows'='0',
>   'totalSize'='586773666',
>   'rawDataSize'='0')
> Time taken: 0.249 seconds, Fetched: 98 row(s)
> {code}
> The reducer fails with the following exception:
> {code}
> 2014-06-11 12:32:39,051 INFO org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 16000 rows for join key [663184]
> 2014-06-11 12:32:39,061 INFO org.apache.hadoop.hive.ql.exec.persistence.RowContainer: RowContainer created temp file /mnt/volume2/mapred/local/taskTracker/muthu.nivas/jobcache/job_201405301214_170634/attempt_201405301214_170634_r_000000_0/work/tmp/hive-rowcontainer413460656723947992/RowContainer1053550561043043830.tmp
> 2014-06-11 12:32:39,237 INFO org.apache.hadoop.mapred.FileInputFormat: Total input paths to process : 2
> 2014-06-11 12:32:39,299 WARN org.apache.hadoop.mapred.Child: Error running child
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: hdfs://elsharpynn001.prod.hulu.com:8020/hive/warehouse/video/video_20140611071209 not a SequenceFile
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:283)
> 	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
> 	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: hdfs://elsharpynn001.prod.hulu.com:8020/hive/warehouse/video/video_20140611071209 not a SequenceFile
> 	at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:237)
> 	at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:74)
> 	at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:644)
> 	at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:758)
> 	at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256)
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:216)
> 	... 7 more
> Caused by: java.io.IOException: hdfs://elsharpynn001.prod.hulu.com:8020/hive/warehouse/video/video_20140611071209 not a SequenceFile
> 	at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1805)
> 	at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1765)
> 	at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1714)
> 	at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1728)
> 	at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:43)
> 	at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:59)
> 	at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:226)
> 	... 12 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
