hadoop-mapreduce-issues mailing list archives

From "Joydeep Sen Sarma (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-2206) The task-cleanup tasks should be optional
Date Tue, 30 Nov 2010 08:19:11 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965147#action_12965147 ]

Joydeep Sen Sarma commented on MAPREDUCE-2206:

AFAIK from looking at the code, there is no requirement for the cleanup to go to the same
machine. It happens to go to the same machine because whenever a task reports failed/killed,
a slot is freed up and the JT (JobTracker) schedules the newly created cleanup task on the
same TT (TaskTracker). But nothing in the code enforces this, and the JT may schedule the
cleanup on a different machine (for example, if the TT was previously oversubscribed).
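
To make the placement argument concrete, here is a toy model of it. This is only a sketch and not the actual JobTracker code: the Tracker and Scheduler classes, their fields, and the placeCleanup helper are all hypothetical names. It also assumes that "oversubscribed" means the tracker was running more tasks than it has slots, which is my reading of the comment rather than something stated in the code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class CleanupPlacementSketch {

    /** Hypothetical stand-in for a TaskTracker (TT). */
    static final class Tracker {
        final String name;
        final int slots;   // configured slot count
        int running;       // tasks currently occupying the tracker

        Tracker(String name, int slots, int running) {
            this.name = name;
            this.slots = slots;
            this.running = running;
        }

        boolean hasFreeSlot() {
            return running < slots;
        }
    }

    /** Hypothetical stand-in for the JobTracker (JT). */
    static final class Scheduler {
        private final Deque<String> pendingCleanups = new ArrayDeque<>();

        /** A failed/killed report releases the task's slot and queues a cleanup. */
        void taskFailed(Tracker tt, String attemptId) {
            tt.running--;
            pendingCleanups.add(attemptId);
        }

        /** On heartbeat, hand a pending cleanup to any tracker with a free slot. */
        String heartbeat(Tracker tt) {
            if (!tt.hasFreeSlot() || pendingCleanups.isEmpty()) {
                return null;
            }
            tt.running++;
            return pendingCleanups.poll();
        }
    }

    /** Returns the name of the tracker that ends up running the cleanup. */
    static String placeCleanup(int runningOnFailingTracker) {
        Tracker a = new Tracker("tt-a", 2, runningOnFailingTracker);
        Tracker b = new Tracker("tt-b", 2, 1);
        Scheduler jt = new Scheduler();

        jt.taskFailed(a, "attempt_0");

        // The failing tracker is asked first because its failure report
        // is the heartbeat that triggered the scheduling decision.
        for (Tracker tt : new Tracker[] {a, b}) {
            if (jt.heartbeat(tt) != null) {
                return tt.name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(placeCleanup(2)); // tt-a: the freed slot is reused
        System.out.println(placeCleanup(3)); // tt-b: tt-a is still full
    }
}
```

In the first call the failing tracker was exactly full, so the freed slot is immediately reused for the cleanup. In the second it was one task over its slot count, so releasing a slot still leaves it full and the cleanup lands on the other tracker. Nothing in the model pins the cleanup to the original machine; co-location is a side effect of slot availability.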

If the failure was caused by problems with task localization (for example), the results
are truly miserable. I have hit scenarios on a bad node where two 10-minute task timeouts
were required to fail a task (one for the task failure and one for its cleanup).

> The task-cleanup tasks should be optional
> -----------------------------------------
>                 Key: MAPREDUCE-2206
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2206
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker
>    Affects Versions: 0.23.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.23.0
> For a job that does not use OutputCommitter.abort(), it should be possible to turn these tasks off.
> This improves the latency of the job because failed tasks are often its bottleneck.
