Mailing-List: contact mapreduce-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: mapreduce-issues@hadoop.apache.org
Date: Mon, 18 Jan 2016 13:17:40 +0000 (UTC)
From: "Junping Du (JIRA)" <jira@apache.org>
To: mapreduce-issues@hadoop.apache.org
Message-ID: <JIRA.12848899.1437745240000.139468.1453123060202@Atlassian.JIRA>
In-Reply-To: <JIRA.12848899.1437745240000@Atlassian.JIRA>
References: <JIRA.12848899.1437745240000@Atlassian.JIRA>
 <JIRA.12848899.1437745240695@arcas>
Subject: [jira] [Commented] (MAPREDUCE-6608) Work Preserving AM Restart for
 MapReduce
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/MAPREDUCE-6608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15105267#comment-15105267 ] 

Junping Du commented on MAPREDUCE-6608:
---------------------------------------

Thanks [~srikanth.sampath] and [~raju.bairishetti] for proposing this JIRA and uploading a design document. This work could significantly improve the reliability of our MapReduce framework.
Having gone through the current design doc, I think storing the new attempt address of the MR AM in ZooKeeper could have scalability issues when an MR job has a massive number of running tasks (tens of thousands or more). I think it would be better to store and retrieve the new MR AM location in HDFS, which scales better.
Also, in my understanding, the YARN Service Registry may not be the best fit for this case. CC [~stevel@apache.org], who is the author of YSR.
I could propose another version of the design with more details in the next few days, in case we haven't started the development work yet.
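
To make the HDFS option concrete, here is a minimal sketch of how a restarted AM might publish its new address and how a running task might look it up. It is an illustration under stated assumptions, not the proposed design: the class name AmLocationStore, the file naming scheme and the methods publish/lookupLatest are all hypothetical, and the local filesystem (java.nio.file) stands in for HDFS so the example runs on its own. A real version would presumably go through org.apache.hadoop.fs.FileSystem (create, then rename) against the job's staging directory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical helper: one small file per AM attempt in a per-job directory.
// The local filesystem is used here as a stand-in for HDFS (assumption).
final class AmLocationStore {
    // Hypothetical naming scheme; the attempt id is zero-padded so that
    // lexicographic order of file names matches numeric order of attempts.
    private static final String PREFIX = "am-location.attempt-";

    // Called by a newly started AM attempt to advertise its host:port.
    static void publish(Path jobDir, int attemptId, String hostPort) throws IOException {
        // Write to a temporary file first, then rename, so that readers
        // never observe a partially written address.
        Path tmp = Files.createTempFile(jobDir, ".tmp-am-location", null);
        Files.write(tmp, hostPort.getBytes(StandardCharsets.UTF_8));
        Path target = jobDir.resolve(String.format("%s%06d", PREFIX, attemptId));
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    // Called by a task that lost its connection to the AM: returns the
    // address published by the highest-numbered attempt, if there is one.
    static Optional<String> lookupLatest(Path jobDir) throws IOException {
        try (Stream<Path> files = Files.list(jobDir)) {
            Optional<Path> newest = files
                .filter(p -> p.getFileName().toString().startsWith(PREFIX))
                .max(Comparator.comparing((Path p) -> p.getFileName().toString()));
            if (!newest.isPresent()) {
                return Optional.empty();
            }
            byte[] bytes = Files.readAllBytes(newest.get());
            return Optional.of(new String(bytes, StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        Path jobDir = Files.createTempDirectory("job-staging");
        publish(jobDir, 1, "host-a.example:40001"); // first AM attempt
        publish(jobDir, 2, "host-b.example:40002"); // AM restarted elsewhere
        System.out.println(lookupLatest(jobDir).orElse("<none>"));
    }
}
```

With this layout each task only reads a small file when it actually loses the AM, rather than every task holding a watch in ZooKeeper. Whether that trade-off holds at scale on a real cluster is exactly what the design discussion would need to settle.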

> Work Preserving AM Restart for MapReduce
> ----------------------------------------
>
>                 Key: MAPREDUCE-6608
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6608
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Srikanth Sampath
>            Assignee: Raju Bairishetti
>         Attachments: WorkPreservingMRAppMaster.pdf
>
>
> A framework for work-preserving AM restart was provided in [YARN-1489|https://issues.apache.org/jira/browse/YARN-1489].  We would like to take advantage of this for MapReduce (MR) applications.  There are some challenges, which are described in the attached document along with a few options.  We solicit feedback from the community.

