hadoop-mapreduce-issues mailing list archives

From "Sharad Agarwal (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3512) Batch jobHistory disk flushes
Date Fri, 09 Dec 2011 20:12:40 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166514#comment-13166514 ]

Sharad Agarwal commented on MAPREDUCE-3512:

bq. Am not sure if the history event handler can handle incomplete events.
It can't. In that case recovery will be aborted and it will fall back to running all tasks
from the start.

hflush has to happen at event boundaries. I knew that doing an hflush on every call might slow
things down, but just didn't want to do premature optimization. The simple fix is to put the
events in a bounded queue and do a write + hflush when it is full. The downside is that some
tasks will be rerun on recovery, but that's completely OK.
As suggested above, additionally flushing on TaskFinishedEvent would be even better.
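
A rough sketch of the batching idea described above, not the actual patch. The class name `BatchingHistoryWriter`, the string event types, and the plain `OutputStream` sink are all made up for illustration; in the AM the sink would presumably be the history file's `FSDataOutputStream`, with `hflush()` in place of `flush()`. Events are only ever written whole, so every flush lands on an event boundary.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: buffers whole history events and flushes them in batches. */
class BatchingHistoryWriter {
    private final OutputStream out;   // stand-in for the DFS history stream (assumption)
    private final int maxBatchSize;   // bound on the number of queued events
    private final List<String> pending = new ArrayList<>();
    private int flushCount = 0;

    BatchingHistoryWriter(OutputStream out, int maxBatchSize) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be >= 1");
        }
        this.out = out;
        this.maxBatchSize = maxBatchSize;
    }

    /** Queue one complete event; flush when the queue is full or a task finished. */
    void handle(String eventType, String payload) {
        pending.add(eventType + ":" + payload);
        // "TASK_FINISHED" is an illustrative label standing in for TaskFinishedEvent.
        if (pending.size() >= maxBatchSize || "TASK_FINISHED".equals(eventType)) {
            flushBatch();
        }
    }

    /** Write all queued events, then flush once for the whole batch. */
    void flushBatch() {
        if (pending.isEmpty()) {
            return;
        }
        try {
            for (String event : pending) {
                out.write((event + "\n").getBytes(StandardCharsets.UTF_8));
            }
            out.flush(); // a real history writer would call hflush() here
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        pending.clear();
        flushCount++;
    }

    int getFlushCount() {
        return flushCount;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

Anything still queued when the AM dies is lost, which is the "some tasks being rerun on recovery" cost mentioned above. A caller would also want to invoke `flushBatch()` on shutdown so the tail of the queue is not dropped on a clean exit.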

> Batch jobHistory disk flushes
> -----------------------------
>                 Key: MAPREDUCE-3512
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3512
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am, mrv2
>    Affects Versions: 0.23.0
>            Reporter: Siddharth Seth
> The mr-am flushes each individual job history event to disk for AM recovery. The history
> event handler ends up with a significant backlog for tests like MAPREDUCE-3402.
> History events could be batched up based on num records / time / TaskFinishedEvents to
> reduce the number of DFS writes - with the potential drawback of having to rerun some tasks
> during AM recovery.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

