hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-446) Container killed before hprof dumps profile.out
Date Mon, 04 Mar 2013 14:47:14 GMT

    [ https://issues.apache.org/jira/browse/YARN-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592246#comment-13592246 ]

Jason Lowe commented on YARN-446:
---------------------------------

IMO the AM should always give the task attempt time to exit successfully on its own rather
than sending it a kill signal that races with the task attempt's normal shutdown.  This
is very similar to the race between the AM shutting down after unregistering with the RM and
the subsequent kill sent by the RM, which was mitigated by MAPREDUCE-4157.  This would
also help eliminate the many confusing "Container killed by ApplicationMaster" messages that
appear in task attempt diagnostics for tasks that are otherwise operating normally.
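
The idea above (wait for the attempt to exit on its own, and fall back to a kill only
after a grace period) could be sketched roughly as follows.  This is a hypothetical,
self-contained Java illustration, not the actual MapReduce AM code: the class GracefulStop,
the Outcome enum and the graceMillis parameter are made-up names, and Future.cancel is
only a stand-in for the kill signal sent to a container.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical illustration only; not taken from the Hadoop code base. */
public class GracefulStop {

    /** How a finished task attempt eventually went away. */
    public enum Outcome { EXITED_ON_ITS_OWN, KILLED_AFTER_GRACE }

    /**
     * Waits up to graceMillis for the attempt to exit by itself and only
     * then falls back to a forced stop (modelled here as Future.cancel).
     */
    public static Outcome stop(Future<?> attempt, long graceMillis)
            throws InterruptedException {
        try {
            attempt.get(graceMillis, TimeUnit.MILLISECONDS);
            return Outcome.EXITED_ON_ITS_OWN;
        } catch (TimeoutException e) {
            attempt.cancel(true); // stand-in for the kill signal
            return Outcome.KILLED_AFTER_GRACE;
        } catch (ExecutionException e) {
            // The attempt ended by throwing; it still exited without a kill.
            return Outcome.EXITED_ON_ITS_OWN;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Stand-in for a JVM that needs about 200 ms to flush its profile at exit.
            Future<?> attempt = pool.submit(() -> {
                try {
                    Thread.sleep(200);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            System.out.println(stop(attempt, 2_000));
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With a grace period comfortably longer than the exit work, the kill path is never taken,
so there is no race with the normal shutdown and no spurious "killed" diagnostic.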
                
> Container killed before hprof dumps profile.out
> -----------------------------------------------
>
>                 Key: YARN-446
>                 URL: https://issues.apache.org/jira/browse/YARN-446
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.0.3-alpha
>            Reporter: Radim Kolar
>
> If profiling is enabled for a mapper or reducer, hprof dumps profile.out at process
> exit. The dump happens after the task has signaled to the AM that its work is finished.
> The AM kills a container whose work is finished without waiting for hprof to finish its
> dump. If hprof is writing larger output (for example, depth=4 fails while depth=3 works),
> it cannot finish the dump before being killed, which makes the entire dump unusable
> because the CPU and heap stats are missing.
> There needs to be a longer delay before the container is killed if profiling is enabled.

