Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: yarn-issues@hadoop.apache.org
Date: Mon, 4 Mar 2013 14:47:14 +0000 (UTC)
From: "Jason Lowe (JIRA)" <jira@apache.org>
To: yarn-issues@hadoop.apache.org
Message-ID: <JIRA.12635029.1362324362404.375343.1362408434092@arcas>
In-Reply-To: <JIRA.12635029.1362324362404@arcas>
References: <JIRA.12635029.1362324362404@arcas>
Subject: [jira] [Commented] (YARN-446) Container killed before hprof dumps
 profile.out
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/YARN-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592246#comment-13592246 ] 

Jason Lowe commented on YARN-446:
---------------------------------

IMO the AM should always give the task attempt time to exit on its own rather than sending it a kill signal that races with the task attempt's normal shutdown.  This is very similar to the race between the AM shutting down after unregistering with the RM and the subsequent kill sent by the RM, which was mitigated by MAPREDUCE-4157.  This would also help eliminate the many confusing "Container killed by ApplicationMaster" messages that appear in task attempt diagnostics for tasks that are otherwise operating normally.
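
To make the idea concrete, here is a minimal sketch of "wait for a clean exit first, kill only on timeout". It is not the actual AM or NodeManager code: the class GracefulStop, the method awaitExitOrKill, and the graceMillis parameter are hypothetical names used only for illustration, and the task's exit is modelled as a plain java.util.concurrent.Future.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Hadoop: illustrates a grace period before a kill.
public class GracefulStop {

    /**
     * Waits up to graceMillis for the task to finish on its own.
     * The kill action runs only if the grace period expires first.
     *
     * @return true if the task exited by itself, false if it had to be killed
     */
    public static boolean awaitExitOrKill(Future<?> exit, long graceMillis, Runnable kill)
            throws InterruptedException {
        try {
            // Block until the task completes or the grace period runs out.
            exit.get(graceMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Still running after the grace period: fall back to the kill.
            kill.run();
            return false;
        } catch (ExecutionException e) {
            // The task ended on its own, albeit with an error; no kill is needed.
            return true;
        }
    }
}
```

With something along these lines, a task that finishes within the grace period (for example, one still writing profile.out) would never see a kill, and the kill diagnostic would be left for attempts that really hang.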
                
> Container killed before hprof dumps profile.out
> -----------------------------------------------
>
>                 Key: YARN-446
>                 URL: https://issues.apache.org/jira/browse/YARN-446
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.0.3-alpha
>            Reporter: Radim Kolar
>
> If profiling is enabled for a mapper or reducer, hprof dumps profile.out at process exit. The dump happens after the task has signaled to the AM that its work is finished.
> The AM kills the container once its work is finished, without waiting for hprof to finish dumping. If hprof is writing a larger output (for example with depth=4, whereas depth=3 works), it cannot finish the dump before being killed, which makes the entire dump unusable because the CPU and heap stats are missing.
> There needs to be a better delay before the container is killed if profiling is enabled.
