hive-issues mailing list archives

From "Xuefu Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-16156) FileSinkOperator should delete existing output target when renaming
Date Thu, 09 Mar 2017 16:34:38 GMT

     [ https://issues.apache.org/jira/browse/HIVE-16156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated HIVE-16156:
-------------------------------
    Description: 
If a task gets killed (for whatever reason) after it has finished renaming its temp output
to the final output during commit, subsequent task attempts will fail at the rename because
the target output already exists: the first attempt committed successfully but was killed
before it could report success, so the scheduler launched another attempt. This can happen,
albeit rarely.
{code}
Job failed with org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename output from: hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 to: hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0
FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask.
java.util.concurrent.ExecutionException: Exception thrown by job
	at org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:311)
	at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:316)
	at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382)
	at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 306 in stage 5.0 failed 4 times, most recent failure: Lost task 306.4 in stage 5.0 (TID 2956, hadoopworker1444-sjc1.prod.uber.internal): java.lang.IllegalStateException: Hit error while closing operators - failing tree: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename output from: hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 to: hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0
	at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:202)
	at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58)
	at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106)
	at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
	at scala.collection.Iterator$class.foreach(Iterator.scala:727)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
	at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120)
	at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120)
	at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
	at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename output from: hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 to: hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:227)
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.access$200(FileSinkOperator.java:133)
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:1019)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:179)
	... 15 more
{code}
Hive should check whether the target output already exists and delete it before renaming.
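
A minimal sketch of the proposed behavior, assuming the Hadoop FileSystem API that FileSinkOperator already uses for its commit rename; the class and method names here are hypothetical illustrations, not the actual patch:
{code}
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class CommitRenameSketch {
  // Hypothetical helper: commit a task's temp output by renaming it to the
  // final path, first removing any stale target left behind by an attempt
  // that completed the rename but was killed before reporting success.
  static void commitRename(FileSystem fs, Path tmpPath, Path finalPath)
      throws IOException {
    if (fs.exists(finalPath)) {
      // A previous attempt already committed this file; delete it
      // (non-recursively, since it is a single file) so the rename succeeds.
      fs.delete(finalPath, false);
    }
    if (!fs.rename(tmpPath, finalPath)) {
      throw new IOException("Unable to rename output from: " + tmpPath
          + " to: " + finalPath);
    }
  }
}
{code}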

  was:
If a task gets killed (for whatever reason) after it has finished renaming its temp output
to the final output during commit, subsequent task attempts will fail at the rename because
the target output already exists. This can happen, albeit rarely.

Hive should check whether the target output already exists and delete it before renaming.


> FileSinkOperator should delete existing output target when renaming
> -------------------------------------------------------------------
>
>                 Key: HIVE-16156
>                 URL: https://issues.apache.org/jira/browse/HIVE-16156
>             Project: Hive
>          Issue Type: Bug
>          Components: Operators
>    Affects Versions: 1.1.0
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>         Attachments: HIVE-16156.patch
>



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
