hive-issues mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-17113) Duplicate bucket files can get written to table by runaway task
Date Tue, 18 Jul 2017 03:19:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-17113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16091048#comment-16091048 ]

Hive QA commented on HIVE-17113:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12877705/HIVE-17113.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 11065 tests executed
*Failed tests:*
{noformat}
TestSSL - did not produce a TEST-*.xml file (likely timed out) (batchId=224)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite] (batchId=238)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_2] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_use_op_stats] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_use_ts_stats_for_mapjoin] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=167)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=233)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[skewjoin] (batchId=110)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6070/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6070/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6070/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12877705 - PreCommit-HIVE-Build

> Duplicate bucket files can get written to table by runaway task
> ---------------------------------------------------------------
>
>                 Key: HIVE-17113
>                 URL: https://issues.apache.org/jira/browse/HIVE-17113
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>         Attachments: HIVE-17113.1.patch
>
>
> Saw a table get a duplicate bucket file from a Hive query. It looks like the following happened:
> 1. Task attempt A_0 starts, but then stops making progress.
> 2. The job was running with speculative execution on, and task attempt A_1 is started.
> 3. Task attempt A_1 finishes execution and saves its output to the temp directory.
> 5. A task kill is sent to A_0, though this does not appear to actually kill A_0.
> 6. The job for the query finishes and Utilities.mvFileToFinalPath() calls Utilities.removeTempOrDuplicateFiles() to check for duplicate bucket files.
> 7. A_0 (still running) finally finishes and saves its file to the temp directory. At this point we now have duplicate bucket files - oops!
> 8. Utilities.removeTempOrDuplicateFiles() moves the temp directory to the final location, where it is later moved to the partition directory.
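
The sequence described above is a check-then-act race: the duplicate check sees one listing of the temp directory, and the later move acts on whatever the directory contains by then. The sketch below is a toy in-memory model of that race, not Hive code and not the attached patch. The class name, the method names, the "<bucket>_<attempt>" file naming, and the "keep the first file seen per bucket" rule are all assumptions made for illustration. It contrasts moving the whole directory with moving only the files that survived the duplicate check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: lists of file names stand in for the temp directory.
public class DuplicateBucketRace {

    // Assumed naming scheme "<bucket>_<attempt>", e.g. "000000_1".
    static String bucketOf(String fileName) {
        return fileName.substring(0, fileName.indexOf('_'));
    }

    // Simplified duplicate check: keep the first file seen for each bucket.
    static List<String> dedupe(List<String> listing) {
        Map<String, String> keep = new TreeMap<>();
        for (String f : listing) {
            keep.putIfAbsent(bucketOf(f), f);
        }
        return new ArrayList<>(keep.values());
    }

    // Racy variant: moves everything present in the directory at move time,
    // including files written after the duplicate check ran.
    static List<String> moveWholeDirectory(List<String> tempDir) {
        return new ArrayList<>(tempDir);
    }

    // Hypothetical safer variant: moves only files that survived the check.
    static List<String> moveSnapshot(List<String> snapshot, List<String> tempDir) {
        List<String> moved = new ArrayList<>();
        for (String f : tempDir) {
            if (snapshot.contains(f)) {
                moved.add(f);
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        List<String> tempDir = new ArrayList<>();
        tempDir.add("000000_1");                  // attempt A_1 finishes
        List<String> snapshot = dedupe(tempDir);  // duplicate check runs
        tempDir.add("000000_0");                  // runaway attempt A_0 finishes late
        System.out.println(moveWholeDirectory(tempDir));     // [000000_1, 000000_0]
        System.out.println(moveSnapshot(snapshot, tempDir)); // [000000_1]
    }
}
```

In this model the first variant ends up with two files for bucket 000000, matching the symptom reported above, while the second ignores the late file. Whether a real fix should skip, delete, or otherwise handle the late file is a separate design question that the sketch does not settle.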



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
