hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4216) Container logs not shown for newly assigned containers after NM recovery
Date Thu, 01 Oct 2015 15:01:27 GMT

    [ https://issues.apache.org/jira/browse/YARN-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939931#comment-14939931 ]

Jason Lowe commented on YARN-4216:

Not necessarily.  A nodemanager could also be shutting down due to an uncaught exception,
crash, etc. or an admin could be shutting down a nodemanager without an intention of restarting
it.  That's why YARN-1362 was done, so we can explicitly tell the nodemanager whether or not
the NM is under supervision and likely to restart.  If the NM is not under supervision then
kill -9 should be used for the restart scenario and yarn --daemon stop nodemanager used for
shutting it down.

> Container logs not shown for newly assigned containers after NM recovery
> --------------------------------------------------------------------------
>                 Key: YARN-4216
>                 URL: https://issues.apache.org/jira/browse/YARN-4216
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: log-aggregation, nodemanager
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>            Priority: Critical
>         Attachments: NMLog, ScreenshotFolder.png, yarn-site.xml
> Steps to reproduce
> # Start 2 nodemanagers with NM recovery enabled
> # Submit a pi job with 20 maps
> # Once 5 maps have completed on NM1, stop the NM (yarn --daemon stop nodemanager)
> (Logs of all completed containers get aggregated to HDFS)
> # Now start NM1 again and wait for job completion
> *The newly assigned container logs on NM1 are not shown*
> *hdfs log dir state*
> # When logs are aggregated to HDFS during the stop, the file is named (localhost_38153)
> # On log aggregation after starting the NM, the newly assigned container logs get uploaded with the name (localhost_38153.tmp)
> In the History Server, the logs are not shown for the new task attempts
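
One possible reading of the HDFS state described above, offered as an assumption and not a confirmed diagnosis: if the component that displays aggregated logs skips files whose names end in .tmp (treating them as uploads still in progress), then only the file written at NM stop would be visible. A minimal Python sketch of that filtering follows. The names visible_node_logs and TMP_SUFFIX are hypothetical and are not YARN APIs; only the two file names come from the report.

```python
# Illustrative only: mimics a log reader that ignores in-progress uploads.
# TMP_SUFFIX and visible_node_logs are hypothetical names, not YARN APIs.
TMP_SUFFIX = ".tmp"  # assumed marker for an unfinished aggregated log file


def visible_node_logs(file_names):
    """Return the aggregated log files such a reader would display."""
    return [name for name in file_names if not name.endswith(TMP_SUFFIX)]


# File names taken from the report: one finalized when the NM was stopped,
# one left with a .tmp suffix after the NM was started again.
hdfs_log_dir = ["localhost_38153", "localhost_38153.tmp"]
print(visible_node_logs(hdfs_log_dir))
```

Under this assumption, logs for containers assigned after the restart would stay hidden for as long as the second file keeps its .tmp name.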

This message was sent by Atlassian JIRA
