hadoop-yarn-issues mailing list archives

From "Rohith Sharma K S (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-6695) Race condition in RM for publishing container events vs appFinished events causes NPE
Date Wed, 07 Jun 2017 06:22:18 GMT
Rohith Sharma K S created YARN-6695:
---------------------------------------

             Summary: Race condition in RM for publishing container events vs appFinished events causes NPE
                 Key: YARN-6695
                 URL: https://issues.apache.org/jira/browse/YARN-6695
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Rohith Sharma K S


When the RM publishes container events, i.e. when *yarn.rm.system-metrics-publisher.emit-container-events*
is enabled, there is a race between processing those events and the appFinished event,
which removes the appId from the collector list. If the removal happens first, publishing
the remaining container events causes an NPE.

Look at the trace below, where the appId is removed from the collectors first and the
corresponding events are processed afterwards.
{noformat}
2017-06-06 19:28:48,896 INFO  capacity.ParentQueue (ParentQueue.java:removeApplication(472)) - Application removed - appId: application_1496758895643_0005 user: root leaf-queue of parent: root #applications: 0
2017-06-06 19:28:48,921 INFO  collector.TimelineCollectorManager (TimelineCollectorManager.java:remove(190)) - The collector service for application_1496758895643_0005 was removed
2017-06-06 19:28:48,922 ERROR metrics.TimelineServiceV2Publisher (TimelineServiceV2Publisher.java:putEntity(451)) - Error when publishing entity TimelineEntity[type='YARN_CONTAINER', id='container_e01_1496758895643_0005_01_000002']
java.lang.NullPointerException
	at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher.putEntity(TimelineServiceV2Publisher.java:448)
	at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher.access$100(TimelineServiceV2Publisher.java:72)
	at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher$TimelineV2EventHandler.handle(TimelineServiceV2Publisher.java:480)
	at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher$TimelineV2EventHandler.handle(TimelineServiceV2Publisher.java:469)
	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:201)
	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:127)
	at java.lang.Thread.run(Thread.java:745)
{noformat}
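
The ordering in the trace above can be reproduced in isolation. The following is only a minimal sketch and not the actual RM code: the class names (Collector, CollectorRaceSketch) and methods are hypothetical stand-ins, and it assumes the publisher looks up a per-application collector in a map and dereferences the result. The guarded variant shows one possible way to tolerate a collector that is already gone; it is not necessarily the fix for this issue.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a per-application timeline collector.
final class Collector {
    private int published = 0;

    void putEntity(String entityId) {
        published++;
    }

    int publishedCount() {
        return published;
    }
}

// Hypothetical stand-in for the collector registry plus the publisher's lookup.
final class CollectorRaceSketch {
    private final Map<String, Collector> collectors = new ConcurrentHashMap<>();

    void register(String appId) {
        collectors.put(appId, new Collector());
    }

    // Models the appFinished path: the collector for the app is dropped.
    void remove(String appId) {
        collectors.remove(appId);
    }

    // Unguarded lookup: throws NullPointerException if the app's collector
    // was removed before this container event is processed.
    void publishUnguarded(String appId, String entityId) {
        Collector collector = collectors.get(appId);
        collector.putEntity(entityId);
    }

    // Guarded lookup (an assumption, not the project's fix): skip the event
    // and report false when the collector no longer exists.
    boolean publishGuarded(String appId, String entityId) {
        Collector collector = collectors.get(appId);
        if (collector == null) {
            return false;
        }
        collector.putEntity(entityId);
        return true;
    }
}
```

Calling remove("app") and then publishUnguarded("app", ...) follows the same order as the log: collector removed first, container event processed second. The guarded variant drops the late event instead of throwing, which trades a lost container event for stability.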


