hive-issues mailing list archives

From "Aihua Xu (JIRA)" <>
Subject [jira] [Commented] (HIVE-15054) Hive insertion query execution fails on Hive on Spark
Date Wed, 26 Oct 2016 12:49:59 GMT


Aihua Xu commented on HIVE-15054:

[~lirui] Thanks for taking a look. It would be hard to repro; it depends on what state the
first executor is in when it's aborted or dies. You will see this issue when the task has
finished writing its data to a tmp file and renaming it to the final tmp file, but Spark
then kills the task (as in your case) or the executor loses its connection at that point.
The case I have seen is: the connection to the executor times out while the executor is
almost done with its work (the result has been written and renamed to the final tmp file,
and the only thing left is to report to the driver that the task is done).

If the rename doesn't happen, you won't see this issue.

> Hive insertion query execution fails on Hive on Spark
> -----------------------------------------------------
>                 Key: HIVE-15054
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>    Affects Versions: 2.0.0
>            Reporter: Aihua Xu
>            Assignee: Aihua Xu
>         Attachments: HIVE-15054.1.patch
> The query {{insert overwrite table tbl1}} sometimes fails with the following errors.
It seems we are constructing the taskAttemptId from the partitionId, which is not unique
when there are multiple attempts.
> {noformat}
> java.lang.IllegalStateException: Hit error while closing operators - failing tree: org.apache.hadoop.hive.ql.metadata.HiveException:
Unable to rename output from: hdfs://table1/.hive-staging_hive_2016-06-14_01-53-17_386_3231646810118049146-9/_task_tmp.-ext-10002/_tmp.002148_0
to: hdfs://table1/.hive-staging_hive_2016-06-14_01-53-17_386_3231646810118049146-9/_tmp.-ext-10002/002148_0
> at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(
> at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(
> at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(
> at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120)
> {noformat}
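The race above can be sketched with a toy model (a hypothetical sketch with illustrative names, not Hive's actual API): an HDFS-style rename fails when the destination already exists, so a retried attempt that reuses the same partition-derived name collides with the file the first (killed but already-committed) attempt left behind. Deriving the name from the real attempt number as well, which is the direction this issue points at, avoids the collision.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the commit-rename race (illustrative names, not Hive's API).
public class RenameRaceSketch {
    // A fake filesystem: path -> contents.
    static Map<String, String> fs = new HashMap<>();

    // HDFS-style rename: fails if the source is gone or the destination exists.
    static boolean rename(String from, String to) {
        if (!fs.containsKey(from) || fs.containsKey(to)) {
            return false;
        }
        fs.put(to, fs.remove(from));
        return true;
    }

    // One attempt finishing: write the task output, then rename it to its final
    // tmp name on close. With uniqueIds=false the name is derived from the
    // partition/task id alone, mirroring the bug described in this issue.
    static boolean commit(int taskId, int attempt, boolean uniqueIds) {
        String name = String.format("%06d_%d", taskId, uniqueIds ? attempt : 0);
        String src = "_task_tmp.-ext-10002/_tmp." + name;
        fs.put(src, "rows");
        return rename(src, "_tmp.-ext-10002/" + name);
    }

    public static void main(String[] args) {
        // Attempt 0 commits fully but the executor loses its connection before
        // reporting success, so Spark schedules attempt 1 for the same task.
        System.out.println(commit(2148, 0, false)); // first rename lands
        System.out.println(commit(2148, 1, false)); // same name: dest exists, rename fails
        fs.clear();
        System.out.println(commit(2148, 0, true));  // unique per-attempt names:
        System.out.println(commit(2148, 1, true));  // both renames succeed
    }
}
```

In this model the second attempt under partition-only naming hits exactly the "Unable to rename output" failure from the stack trace, while per-attempt naming lets both attempts commit independently.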

This message was sent by Atlassian JIRA
