hadoop-mapreduce-issues mailing list archives

From "Ahmed Radwan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4284) Allow setting yarn.nodemanager.delete.debug-delay-sec on a per-job basis
Date Fri, 25 May 2012 01:23:47 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282888#comment-13282888 ]

Ahmed Radwan commented on MAPREDUCE-4284:
-----------------------------------------

Arun & Tucu,

What about the following approach:

1- We'll add a new boolean property (e.g. mapreduce.job.debug.delete) whose default value will
be true. This property can be set by the user when submitting the job.

2- Then, in the NM DeletionService, the effective delay before deleting will be:

{code}
int effDelay = jobConf.getBoolean("mapreduce.job.debug.delete", true)
    ? 0
    : conf.getInt(YarnConfiguration.DEBUG_NM_DELETE_DELAY_SEC, 0);
{code}

This will guarantee that the value set in the NM property yarn.nodemanager.delete.debug-delay-sec
will only apply to jobs where the user explicitly set mapreduce.job.debug.delete to false.
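
To make the intended behaviour concrete, here is a minimal standalone sketch of the same decision. It is not NM code: java.util.Properties stands in for the Hadoop job and NM configurations, mapreduce.job.debug.delete is only the property name proposed above, and the class and method names are hypothetical.

```java
import java.util.Properties;

public class EffectiveDelaySketch {
    // Proposed job-level property from this comment (does not exist yet).
    static final String JOB_DEBUG_DELETE = "mapreduce.job.debug.delete";
    // Existing NM-level property discussed in this issue.
    static final String NM_DELETE_DELAY_SEC = "yarn.nodemanager.delete.debug-delay-sec";

    // Returns the delay (in seconds) before the job's files would be deleted.
    // Properties is only a stand-in for the real configuration objects.
    static int effectiveDelaySec(Properties jobConf, Properties nmConf) {
        boolean deleteImmediately =
            Boolean.parseBoolean(jobConf.getProperty(JOB_DEBUG_DELETE, "true"));
        if (deleteImmediately) {
            // Default case: the job did not opt in, so the NM delay is ignored.
            return 0;
        }
        // The job opted in: fall back to whatever delay the NM is configured with.
        return Integer.parseInt(nmConf.getProperty(NM_DELETE_DELAY_SEC, "0"));
    }

    public static void main(String[] args) {
        Properties nmConf = new Properties();
        nmConf.setProperty(NM_DELETE_DELAY_SEC, "600");

        Properties defaultJob = new Properties();   // user set nothing
        Properties debugJob = new Properties();     // user asked to keep files
        debugJob.setProperty(JOB_DEBUG_DELETE, "false");

        System.out.println(effectiveDelaySec(defaultJob, nmConf)); // prints 0
        System.out.println(effectiveDelaySec(debugJob, nmConf));   // prints 600
    }
}
```

Note that the job can only switch the NM-configured delay on or off; it cannot choose its own value, which is what keeps the maximum retention time under the administrator's control.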

This approach addresses Arun's concern about the property being misused to fill the NM
disks and, at the same time, avoids flooding the nodes with retained files for jobs whose
submitter does not care about them (as Tucu highlighted).

What do you think?

                
> Allow setting yarn.nodemanager.delete.debug-delay-sec on a per-job basis
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4284
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4284
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>            Reporter: Ahmed Radwan
>            Assignee: Ahmed Radwan
>
> The yarn.nodemanager.delete.debug-delay-sec property is helpful in debugging jobs (inspecting
> container logs/local dirs after the job finishes). Currently it is a nodemanager property,
> and changing it requires restarting the nodemanager. In a production cluster this can be a
> real problem. It would be better to set this property on a per-job basis, without requiring
> a restart of the nodemanagers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
