hive-issues mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-16156) FileSinkOperator should delete existing output target when renaming
Date Sat, 11 Mar 2017 04:25:04 GMT

    [ https://issues.apache.org/jira/browse/HIVE-16156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906063#comment-15906063 ]

Hive QA commented on HIVE-16156:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12857437/HIVE-16156.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10339 tests executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestJdbcDriver2.testSelectExecAsync2 (batchId=217)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4083/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4083/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4083/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12857437 - PreCommit-HIVE-Build

> FileSinkOperator should delete existing output target when renaming
> -------------------------------------------------------------------
>
>                 Key: HIVE-16156
>                 URL: https://issues.apache.org/jira/browse/HIVE-16156
>             Project: Hive
>          Issue Type: Bug
>          Components: Operators
>    Affects Versions: 1.1.0
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>         Attachments: HIVE-16156.1.patch, HIVE-16156.2.patch, HIVE-16156.patch
>
>
> If a task gets killed (for whatever reason) after it finishes renaming the temp output to the final output during commit, subsequent task attempts will fail when renaming because the target output already exists. This can happen, however rarely.
> {code}
> Job failed with org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename output from: hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 to: hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0
> FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.util.concurrent.ExecutionException: Exception thrown by job
> 	at org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:311)
> 	at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:316)
> 	at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382)
> 	at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 306 in stage 5.0 failed 4 times, most recent failure: Lost task 306.4 in stage 5.0 (TID 2956, hadoopworker1444-sjc1.prod.uber.internal): java.lang.IllegalStateException: Hit error while closing operators - failing tree: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename output from: hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 to: hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0
> 	at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:202)
> 	at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58)
> 	at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106)
> 	at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> 	at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> 	at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> 	at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120)
> 	at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120)
> 	at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
> 	at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
> 	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:89)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename output from: hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_task_tmp.-ext-10001/_tmp.000306_0 to: hdfs://nameservice1/tmp/hive-staging/xuefu_hive_2017-03-08_02-55-25_355_1482508192727176207-1/_tmp.-ext-10001/000306_0
> 	at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:227)
> 	at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.access$200(FileSinkOperator.java:133)
> 	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:1019)
> 	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
> 	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
> 	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
> 	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
> 	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
> 	at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:179)
> 	... 15 more
> {code}
> Hive should check the existence of the target output and delete it before renaming.
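
The proposed check-then-delete-then-rename behavior can be illustrated with a minimal sketch. This is not the HIVE-16156 patch: it uses java.nio.file on the local filesystem so that it runs without Hadoop on the classpath, and the class name CommitRenameSketch and the method commit are hypothetical names chosen for the example. Against HDFS the analogous calls would presumably be FileSystem.exists, FileSystem.delete and FileSystem.rename.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; not code from FileSinkOperator.
public class CommitRenameSketch {

    // Renames tmp to target, first removing a target left behind by an
    // earlier task attempt that was killed after its own rename.
    // Assumes target is a regular file (deleteIfExists does not remove
    // non-empty directories).
    static void commit(Path tmp, Path target) throws IOException {
        Files.deleteIfExists(target);
        // Without the delete above, Files.move would throw
        // FileAlreadyExistsException when the stale target exists.
        Files.move(tmp, target);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("commit-sketch");
        Path target = dir.resolve("000306_0");
        Path tmp = dir.resolve("_tmp.000306_0");

        // Simulate attempt 0: it completed the rename and was then killed.
        Files.write(target, "attempt-0".getBytes());
        // Simulate attempt 1: it has written its temp output and now commits.
        Files.write(tmp, "attempt-1".getBytes());

        commit(tmp, target);
        System.out.println(new String(Files.readAllBytes(target))); // prints "attempt-1"
    }
}
```

Deleting the stale target makes the commit step safe to repeat across task attempts, at the cost of discarding whatever the killed attempt wrote. That is acceptable only if every attempt of the same task produces equivalent output, which is an assumption of this sketch.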



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
