hadoop-mapreduce-user mailing list archives

From Arun C Murthy <...@yahoo-inc.com>
Subject Re: What's the purpose of a setup and cleanup task?
Date Sun, 04 Apr 2010 00:04:28 GMT
 From within the MR framework you can do a 2-phase commit: per-task  
and per-job.

# There is one setup task at the beginning of the job and one cleanup
task at the end of the job (@see OutputCommitter.setupJob and
OutputCommitter.(commitJob|abortJob)). These run as separate tasks on
any one of the tasktrackers at the start and end of the job.

# There is also a per-task cleanup task if necessary (@see
OutputCommitter.needsTaskCommit and
OutputCommitter.(commitTask|abortTask)). We try to piggy-back the
per-task cleanup in the same JVM that ran the task; however, we need to
spawn the per-task cleanup task separately if the original map/reduce
task failed.
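
The ordering of these calls can be sketched in plain Java. This is only
an illustration, not Hadoop code: `CommitFlowSketch` and
`SketchCommitter` are hypothetical names, the method names merely mirror
those on OutputCommitter, the String arguments stand in for the real
context objects, and a failed task is assumed to fail the whole job with
no retry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CommitFlowSketch {

    /** Hypothetical stand-in for OutputCommitter; signatures are simplified. */
    interface SketchCommitter {
        void setupJob(String jobId);
        void commitJob(String jobId);
        void abortJob(String jobId);
        boolean needsTaskCommit(String taskId);
        void commitTask(String taskId);
        void abortTask(String taskId);
    }

    /** Records every call so the order can be inspected afterwards. */
    static class RecordingCommitter implements SketchCommitter {
        final List<String> log = new ArrayList<>();

        public void setupJob(String jobId)  { log.add("setupJob " + jobId); }
        public void commitJob(String jobId) { log.add("commitJob " + jobId); }
        public void abortJob(String jobId)  { log.add("abortJob " + jobId); }
        // Assumption: every successful task has output that needs committing.
        public boolean needsTaskCommit(String taskId) { return true; }
        public void commitTask(String taskId) { log.add("commitTask " + taskId); }
        public void abortTask(String taskId)  { log.add("abortTask " + taskId); }
    }

    /**
     * Drives the flow: one job setup, a commit or abort per task,
     * then one job-level commit or abort.
     */
    static List<String> run(String jobId, List<String> taskIds,
                            List<String> failedTaskIds) {
        RecordingCommitter committer = new RecordingCommitter();
        committer.setupJob(jobId);                 // job-level setup, once

        boolean jobSucceeded = true;
        for (String taskId : taskIds) {
            if (failedTaskIds.contains(taskId)) {
                committer.abortTask(taskId);       // per-task cleanup after failure
                jobSucceeded = false;              // simplification: no retries
            } else if (committer.needsTaskCommit(taskId)) {
                committer.commitTask(taskId);      // per-task commit
            }
        }

        if (jobSucceeded) {
            committer.commitJob(jobId);            // job-level commit, once
        } else {
            committer.abortJob(jobId);             // job-level abort, once
        }
        return committer.log;
    }

    public static void main(String[] args) {
        List<String> calls = run("job_1",
                Arrays.asList("m_0", "m_1", "r_0"),
                Arrays.asList("m_1"));
        for (String call : calls) {
            System.out.println(call);
        }
    }
}
```

However many map or reduce tasks the job has, the job-level setup and
commit/abort each happen once in this sketch, while the task-level
commit/abort happens once per task.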

Hope that helps,

On Apr 2, 2010, at 9:00 AM, psdc1978 wrote:

> Hi, I posted this last month, but I haven't gotten a response.
> Does anyone know the answer to this question?
> I would like to understand the purpose of a setup and cleanup task.
> During the start-up of the job tracker, 2 setup tasks and 2 cleanup
> tasks are assigned, for the map and for the reduce. My questions are:
> - What's the purpose of a setup task?
> - Does the setup class run on the jobtracker side or on the
>   tasktracker side?
> - If I set around 50 map tasks to run in my hadoop example, will only
>   one setup task initiate all map activities? (The same question for
>   the reduce side.)
> The above questions also apply to the cleanup task.
> Regards,
> -- 
> Pedro
