hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (Assigned) (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (MAPREDUCE-3738) NM can hang during shutdown if AppLogAggregatorImpl thread dies unexpectedly
Date Wed, 22 Feb 2012 14:49:49 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe reassigned MAPREDUCE-3738:
-------------------------------------

    Assignee: Jason Lowe
    
> NM can hang during shutdown if AppLogAggregatorImpl thread dies unexpectedly
> ----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3738
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3738
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, nodemanager
>    Affects Versions: 0.23.1, 0.24.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: livehistdump.txt
>
>
> If an AppLogAggregator thread dies unexpectedly (e.g., from an uncaught exception such as the OutOfMemoryError in the case I saw), the nodemanager hangs during shutdown.  The NM calls AppLogAggregatorImpl.join() during shutdown to make sure log aggregation has completed, and that method internally waits for an atomic boolean to be set by the log aggregation thread to indicate it has finished.  Since the thread was killed off earlier by the uncaught exception, the boolean is never set and the NM hangs during shutdown, repeating something like this every second in the log file:
> 2012-01-25 22:20:56,366 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Waiting for aggregation to complete for application_1326848182580_2806
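
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the actual Hadoop code: AggregatorSketch, aggregate(), demo() and the guardWithFinally switch are all hypothetical names made up for illustration, and the wait is given a timeout only so the demonstration terminates. It shows one possible way to avoid the hang, namely setting the completion flag in a finally block so it is set even when the worker dies from an uncaught exception. Whether the eventual fix for MAPREDUCE-3738 takes this approach is not stated in this message.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a log aggregation worker; not the Hadoop class.
public class AggregatorSketch implements Runnable {
    private final AtomicBoolean finished = new AtomicBoolean(false);
    private final boolean guardWithFinally;

    public AggregatorSketch(boolean guardWithFinally) {
        this.guardWithFinally = guardWithFinally;
    }

    // Simulates the worker dying partway through its work.
    private void aggregate() {
        throw new IllegalStateException("simulated failure in aggregation thread");
    }

    @Override
    public void run() {
        if (guardWithFinally) {
            try {
                aggregate();
            } finally {
                finished.set(true); // set even when aggregate() throws
            }
        } else {
            aggregate();
            finished.set(true); // never reached when aggregate() throws
        }
    }

    // Polls the flag like the wait described above, but gives up after
    // timeoutMillis. Returns true only if the flag was set in time.
    public boolean join(long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!finished.get()) {
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            Thread.sleep(10);
        }
        return true;
    }

    // Runs one worker to its death, then reports whether the wait completes.
    public static boolean demo(boolean guardWithFinally) {
        AggregatorSketch agg = new AggregatorSketch(guardWithFinally);
        Thread worker = new Thread(agg, "aggregator-sketch");
        // Silence the default stack trace for the simulated failure.
        worker.setUncaughtExceptionHandler((thread, error) -> { });
        worker.start();
        try {
            worker.join(); // the worker thread itself has terminated here
            return agg.join(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("unguarded wait completes: " + demo(false));
        System.out.println("guarded wait completes:   " + demo(true));
    }
}
```

In the unguarded variant the worker thread is already dead when the wait starts, so without the timeout the polling loop would spin forever, which matches the once-per-second log line quoted above. An alternative under the same assumptions would be for the waiting side to also check Thread.isAlive() on the worker.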

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
