hadoop-common-dev mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-1127) Speculative Execution and output of Reduce tasks
Date Thu, 15 Mar 2007 23:12:09 GMT
Speculative Execution and output of Reduce tasks

                 Key: HADOOP-1127
                 URL: https://issues.apache.org/jira/browse/HADOOP-1127
             Project: Hadoop
          Issue Type: Improvement
          Components: mapred
    Affects Versions: 0.12.0
            Reporter: Arun C Murthy
         Assigned To: Arun C Murthy
             Fix For: 0.13.0

We've recently seen jobs run with 'speculative execution' become quite unstable and fail
with *AlreadyBeingCreatedException* raised at the NameNode. We could also hit hairy
situations where a failed Reduce task's output clashes with a successful task's (same tip)
output.

As it exists, speculative execution relies on the PhasedFileSystem, which creates a temp output
file; on task completion that file is 'moved' to its final position via a call to
PhasedFileSystem.commit from ReduceTask.run(). This has led to issues such as the above.


Basically the idea is to do this uniformly for all Reduce tasks, i.e. all reducers create
temp files and then have a serialized 'commit' done by the JobTracker, which moves the temp
file to its final position.

We create the temp file in the job's output directory itself:
<output_dir>/_<taskid> (emphasis on the leading '_')

On task completion we'll add that temp file's path to the TaskStatus, and then the JobTracker
moves that file to its final position.
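
The proposed flow can be sketched roughly as below. This is only an illustration on the
local filesystem using java.nio, not Hadoop code: the class SerializedCommitSketch, its
methods, the task ids and the final file name are all hypothetical, and a synchronized
method stands in for the JobTracker serializing commits.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch; none of these names are Hadoop APIs.
final class SerializedCommitSketch {

    // Temp output lives in the job's output directory as _<taskid>.
    static Path tempPath(Path outputDir, String taskId) {
        return outputDir.resolve("_" + taskId);
    }

    // A task attempt only ever writes to its own temp file.
    static Path writeTempOutput(Path outputDir, String taskId, String data) throws IOException {
        Path temp = tempPath(outputDir, taskId);
        Files.write(temp, data.getBytes(StandardCharsets.UTF_8));
        return temp;
    }

    // Stand-in for the JobTracker-side commit. Being synchronized, commits run
    // one at a time; the first attempt to commit wins and later ones are discarded.
    static synchronized boolean commit(Path temp, Path finalPath) throws IOException {
        if (Files.exists(finalPath)) {
            Files.deleteIfExists(temp); // another attempt already committed
            return false;
        }
        Files.move(temp, finalPath, StandardCopyOption.ATOMIC_MOVE);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path outputDir = Files.createTempDirectory("job-output");
        Path finalPath = outputDir.resolve("part-00003"); // assumed final name

        // Two speculative attempts of the same reduce (made-up task ids).
        Path first = writeTempOutput(outputDir, "task_0001_r_000003_0", "attempt 0\n");
        Path second = writeTempOutput(outputDir, "task_0001_r_000003_1", "attempt 1\n");

        System.out.println("first committed:  " + commit(first, finalPath));
        System.out.println("second committed: " + commit(second, finalPath));
    }
}
```

Since each attempt writes only to its own _<taskid> file, two attempts never create the same
path, and only the single commit step decides which output reaches the final position.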


This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
