hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5607) Backport MAPREDUCE-5086 - MR app master deletes staging dir when sent a reboot command from the RM
Date Tue, 19 Nov 2013 21:19:23 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826951#comment-13826951 ]

Jason Lowe commented on MAPREDUCE-5607:
---------------------------------------

Thanks for the patch, Jon.  Comments:

- This patch adds a new JOB_UPDATED_NODES event which is unrelated to the change in
  MAPREDUCE-5086. Nothing generates that event.
- In branch-0.23, the number of AM attempts is set cluster-wide and not per-app as is the
  case in 2.x. Therefore it's probably not appropriate to add
  MRJobConfig.DEFAULT_MR_AM_MAX_ATTEMPTS. Instead we should use
  YarnConfiguration.DEFAULT_RM_AM_MAX_RETRIES to match what the rest of the code is doing
  in branch-0.23.
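
For illustration only, the second point could be sketched roughly as follows. This is a self-contained approximation, not code from the patch: the class, the Map-based configuration, and the method names are hypothetical stand-ins, and the property key and default value given for the YarnConfiguration constants are assumptions about branch-0.23. The idea is that the AM reads the cluster-wide retry limit and keeps the staging dir on a reboot unless the current attempt is the last one.

```java
import java.util.HashMap;
import java.util.Map;

public class StagingCleanupSketch {
    // Stand-ins for YarnConfiguration.RM_AM_MAX_RETRIES and
    // YarnConfiguration.DEFAULT_RM_AM_MAX_RETRIES. The key string and the
    // default of 1 are assumptions, not values confirmed by the message.
    static final String RM_AM_MAX_RETRIES = "yarn.resourcemanager.am.max-retries";
    static final int DEFAULT_RM_AM_MAX_RETRIES = 1;

    // Reads the cluster-wide AM attempt limit, falling back to the default.
    static int maxAttempts(Map<String, String> conf) {
        String value = conf.get(RM_AM_MAX_RETRIES);
        return value == null ? DEFAULT_RM_AM_MAX_RETRIES : Integer.parseInt(value.trim());
    }

    // Attempt IDs are assumed to start at 1.
    static boolean isLastAttempt(int attemptId, Map<String, String> conf) {
        return attemptId >= maxAttempts(conf);
    }

    // On a reboot command, keep the staging dir if another attempt may follow.
    static boolean keepStagingDirOnReboot(int attemptId, Map<String, String> conf) {
        return !isLastAttempt(attemptId, conf);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(RM_AM_MAX_RETRIES, "2");
        System.out.println("attempt 1 keeps staging dir: " + keepStagingDirOnReboot(1, conf));
        System.out.println("attempt 2 keeps staging dir: " + keepStagingDirOnReboot(2, conf));
    }
}
```

With a limit of 2, the first attempt would leave the staging dir in place for the next attempt, while the second (final) attempt would be free to clean it up.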

> Backport MAPREDUCE-5086 - MR app master deletes staging dir when sent a reboot command from the RM
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5607
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5607
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>    Affects Versions: 0.23.9
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: MAPREDUCE-5607-branch-0.23.patch
>
>
> If the RM is restarted when the MR job is running, then it sends a reboot command to
> the job. The job ends up deleting the staging dir and that causes the next attempt to fail.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
