hadoop-yarn-issues mailing list archives

From "Haibo Chen (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-9111) NM crashes because Fair scheduler promotes a container that has not been pulled by AM
Date Tue, 11 Dec 2018 22:39:00 GMT
Haibo Chen created YARN-9111:
--------------------------------

             Summary: NM crashes because Fair scheduler promotes a container that has not been pulled by AM
                 Key: YARN-9111
                 URL: https://issues.apache.org/jira/browse/YARN-9111
             Project: Hadoop YARN
          Issue Type: Sub-task
          Components: fairscheduler, nodemanager
    Affects Versions: YARN-1011
            Reporter: Haibo Chen


{code:java}
2018-10-19 22:34:35,052 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
 java.lang.NullPointerException
 at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerTokenIdentifier(BuilderUtils.java:323)
 at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.handle(ContainerManagerImpl.java:1649)
 at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.handle(ContainerManagerImpl.java:185)
 at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
 at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
 at java.lang.Thread.run(Thread.java:748)
2018-10-19 22:34:35,054 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
2018-10-19 22:34:35,059 DEBUG org.apache.hadoop.service.AbstractService: Service: NodeManager entered state STOPPED
{code}
 

 
When a container is allocated by RM to an application, its container token is not generated
until the AM pulls that container from RM.

However, if the scheduler decides to promote that container before it is pulled by the AM,
it does not have a container token to work with.

The current code does not generate or update the container token in that case. When the container
promotion is sent to the NM for processing, the NM crashes with an NPE.
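
The sequence described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `PromotionSketch`, `AllocatedContainer`, `pullByAm`, `unsafePromote` and `safePromote` are hypothetical names invented for illustration and are not YARN classes or methods. The guard shown is one conceivable way to avoid the null token, not necessarily the fix adopted for this issue.

```java
/**
 * Hypothetical, simplified model of the failure mode. None of these
 * names exist in YARN; they only mimic "token is created lazily on pull".
 */
public class PromotionSketch {

    /** Stand-in for a scheduler-side record of an allocated container. */
    static final class AllocatedContainer {
        final String id;
        String token; // stays null until the AM pulls the container

        AllocatedContainer(String id) {
            this.id = id;
        }

        /** Models the AM pulling the container: the token is generated here. */
        void pullByAm() {
            if (token == null) {
                token = "token-for-" + id;
            }
        }
    }

    /** Promotion path that dereferences the token without checking it. */
    static int unsafePromote(AllocatedContainer c) {
        // Throws NullPointerException if the container was never pulled.
        return c.token.length();
    }

    /** Assumed guard: refuse (or defer) promotion while no token exists. */
    static boolean safePromote(AllocatedContainer c) {
        return c.token != null;
    }

    public static void main(String[] args) {
        AllocatedContainer c = new AllocatedContainer("container_1");

        try {
            unsafePromote(c);
        } catch (NullPointerException e) {
            System.out.println("promotion before pull hit a null token");
        }

        System.out.println("guarded promotion before pull: " + safePromote(c));
        c.pullByAm();
        System.out.println("guarded promotion after pull: " + safePromote(c));
    }
}
```

In this model the unguarded path fails only when promotion happens before the pull, which matches the ordering dependency described in this report.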




