hadoop-mapreduce-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6608) Work Preserving AM Restart for MapReduce
Date Mon, 18 Jan 2016 13:17:40 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15105267#comment-15105267 ]

Junping Du commented on MAPREDUCE-6608:

Thanks [~srikanth.sampath] and [~raju.bairishetti] for proposing this JIRA and uploading a design
document. This work could significantly improve the reliability of our MapReduce framework.

Having gone through the current design doc, I think storing the new attempt's address for the MR AM
in ZooKeeper could have scalability issues when an MR job has a massive number of running tasks
(tens of thousands or more). I think it would be better to store the new MR AM location in HDFS
and have tasks read it from there, since HDFS scales better.

Also, in my understanding, the YARN Service Registry may not be the best fit for this case. CC [~stevel@apache.org],
who is the author of YSR.
I could propose another version of the design with more details in the next few days, if
development work has not started yet.

> Work Preserving AM Restart for MapReduce
> ----------------------------------------
>                 Key: MAPREDUCE-6608
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6608
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Srikanth Sampath
>            Assignee: Raju Bairishetti
>         Attachments: WorkPreservingMRAppMaster.pdf
> Providing a framework for work preserving AM is achieved in [YARN-1489|https://issues.apache.org/jira/browse/YARN-1489].
> We would like to take advantage of this for MapReduce(MR) applications.  There are some challenges
> which have been described in the attached document and few options discussed.  We solicit
> feedback from the community.

This message was sent by Atlassian JIRA
