hive-issues mailing list archives

From "Jason Dere (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-17113) Duplicate bucket files can get written to table by runaway task
Date Mon, 17 Jul 2017 22:13:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-17113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16090666#comment-16090666 ]

Jason Dere commented on HIVE-17113:
-----------------------------------

Talked to [~ashutoshc] and [~sseth] about this. According to Sid this is normally handled
in MR using the OutputCommitter. However Ashutosh mentioned that Hive does not use the Hadoop
OutputCommitter functionality and instead tries to handle duplicate task attempts by itself
- thus the call to Utilities.removeTempOrDuplicateFiles().

A couple of solutions to this on the Hive side:
1) Changing Hive to properly use the OutputCommitter
2) Utilities.mvFileToFinalPath() should call Utilities.removeTempOrDuplicateFiles() after
renaming the temp directory rather than before renaming. This is basically swapping the order
of steps 6 and 8 in the Jira description, within Utilities.mvFileToFinalPath().

Gonna try to do option 2 as it looks like a simpler fix.
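
To illustrate why the ordering in option 2 matters, here is a minimal, self-contained sketch. It is not Hive's code: the class DedupOrderSketch, its methods, the in-memory lists standing in for directories, and the "<bucket>_<attempt>" file naming are all assumptions made for illustration. It also assumes that a late write to the old temp path does not reach the renamed directory; how a real filesystem behaves there depends on the filesystem.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: a "directory" is a list of file names.
class DedupOrderSketch {

    // Assumed naming: "<bucket>_<attempt>", e.g. "000000_1".
    static String bucketOf(String fileName) {
        return fileName.substring(0, fileName.lastIndexOf('_'));
    }

    // Keeps the first file seen for each bucket and drops the rest.
    // (A simplification; the real selection rule may differ.)
    static void removeDuplicates(List<String> dir) {
        Map<String, String> keptByBucket = new LinkedHashMap<>();
        for (String f : dir) {
            keptByBucket.putIfAbsent(bucketOf(f), f);
        }
        dir.clear();
        dir.addAll(keptByBucket.values());
    }

    // Order from the issue description: dedupe the temp directory, then rename it.
    static List<String> dedupeThenRename(List<String> tmpDir, Runnable betweenSteps) {
        removeDuplicates(tmpDir);
        betweenSteps.run();                      // runaway attempt writes here
        return new ArrayList<>(tmpDir);          // "rename" temp to final
    }

    // Proposed order: rename first, then dedupe the renamed directory.
    static List<String> renameThenDedupe(List<String> tmpDir, Runnable betweenSteps) {
        List<String> finalDir = new ArrayList<>(tmpDir); // "rename" temp to final
        tmpDir.clear();                          // old temp path no longer holds the files
        betweenSteps.run();                      // late write targets the old temp path
        removeDuplicates(finalDir);
        return finalDir;
    }

    public static void main(String[] args) {
        List<String> tmp1 = new ArrayList<>(Arrays.asList("000000_1"));
        System.out.println(dedupeThenRename(tmp1, () -> tmp1.add("000000_0")));

        List<String> tmp2 = new ArrayList<>(Arrays.asList("000000_1"));
        System.out.println(renameThenDedupe(tmp2, () -> tmp2.add("000000_0")));
    }
}
```

In this model the first call ends with two files for bucket 000000 in the final directory, while the second ends with one. If the late write instead lands before the rename, the dedupe pass on the renamed directory still removes the extra file.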

> Duplicate bucket files can get written to table by runaway task
> ---------------------------------------------------------------
>
>                 Key: HIVE-17113
>                 URL: https://issues.apache.org/jira/browse/HIVE-17113
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>
> Saw a table get a duplicate bucket file from a Hive query. It looks like the following
> happened:
> 1. Task attempt A_0 starts, but then stops making progress
> 2. The job was running with speculative execution on, and task attempt A_1 is started
> 3. Task attempt A_1 finishes execution and saves its output to the temp directory.
> 5. A task kill is sent to A_0, though this does not appear to actually kill A_0
> 6. The job for the query finishes and Utilities.mvFileToFinalPath() calls Utilities.removeTempOrDuplicateFiles()
> to check for duplicate bucket files
> 7. A_0 (still running) finally finishes and saves its file to the temp directory. At
> this point we now have duplicate bucket files - oops!
> 8. Utilities.mvFileToFinalPath() moves the temp directory to the final location,
> where it is later moved to the partition directory.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
