hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-6695) Race condition in RM for publishing container events vs appFinished events causes NPE
Date Mon, 07 Jan 2019 22:20:00 GMT

     [ https://issues.apache.org/jira/browse/YARN-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan updated YARN-6695:
-----------------------------
    Target Version/s: 3.2.0, 3.1.2

> Race condition in RM for publishing container events vs appFinished events causes NPE
> --------------------------------------------------------------------------------------
>
>                 Key: YARN-6695
>                 URL: https://issues.apache.org/jira/browse/YARN-6695
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Rohith Sharma K S
>            Priority: Critical
>
> When the RM publishes container events, i.e. when *yarn.rm.system-metrics-publisher.emit-container-events* is enabled,
> there is a race between processing those events and the appFinished event, which removes the appId from the collector list.
> If the removal happens first, publishing the pending container events causes an NPE.
> In the trace below, the appId is removed from the collectors first and the corresponding events are processed afterwards.
> {noformat}
> 2017-06-06 19:28:48,896 INFO  capacity.ParentQueue (ParentQueue.java:removeApplication(472)) - Application removed - appId: application_1496758895643_0005 user: root leaf-queue of parent: root #applications: 0
> 2017-06-06 19:28:48,921 INFO  collector.TimelineCollectorManager (TimelineCollectorManager.java:remove(190)) - The collector service for application_1496758895643_0005 was removed
> 2017-06-06 19:28:48,922 ERROR metrics.TimelineServiceV2Publisher (TimelineServiceV2Publisher.java:putEntity(451)) - Error when publishing entity TimelineEntity[type='YARN_CONTAINER', id='container_e01_1496758895643_0005_01_000002']
> java.lang.NullPointerException
> 	at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher.putEntity(TimelineServiceV2Publisher.java:448)
> 	at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher.access$100(TimelineServiceV2Publisher.java:72)
> 	at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher$TimelineV2EventHandler.handle(TimelineServiceV2Publisher.java:480)
> 	at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher$TimelineV2EventHandler.handle(TimelineServiceV2Publisher.java:469)
> 	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:201)
> 	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:127)
> 	at java.lang.Thread.run(Thread.java:745)
> {noformat}
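
The ordering in the trace above (collector removed, then a late container event published) can be reproduced in isolation. The following is a minimal sketch only, not the YARN code and not necessarily the fix adopted for this issue: the class CollectorRaceSketch, its nested Collector, and all method names are hypothetical stand-ins for the collector lookup that happens on the publish path. It contrasts an unguarded lookup, which fails the same way as the trace, with one possible mitigation, a null check that skips events whose collector is already gone.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical stand-in for a per-application collector registry.
 * None of these names come from the YARN code base.
 */
class CollectorRaceSketch {

    /** Hypothetical collector that only counts the entities it receives. */
    static final class Collector {
        private int published;

        synchronized void putEntity(String entityId) {
            published++;
        }

        synchronized int publishedCount() {
            return published;
        }
    }

    private final Map<String, Collector> collectors = new ConcurrentHashMap<>();

    /** Called when an application starts. */
    void register(String appId) {
        collectors.putIfAbsent(appId, new Collector());
    }

    /** Called when the appFinished event is handled. */
    void remove(String appId) {
        collectors.remove(appId);
    }

    Collector get(String appId) {
        return collectors.get(appId);
    }

    /** Unguarded lookup: throws NullPointerException if the app was already removed. */
    void publishUnguarded(String appId, String entityId) {
        collectors.get(appId).putEntity(entityId);
    }

    /** Guarded lookup: returns false and drops the event if the collector is gone. */
    boolean publishGuarded(String appId, String entityId) {
        Collector collector = collectors.get(appId);
        if (collector == null) {
            return false; // late event for a finished application
        }
        collector.putEntity(entityId);
        return true;
    }

    public static void main(String[] args) {
        CollectorRaceSketch registry = new CollectorRaceSketch();
        String appId = "application_1496758895643_0005";
        registry.register(appId);

        // The appFinished handling wins the race: the collector is removed first.
        registry.remove(appId);

        // A container event for the same application arrives afterwards.
        boolean delivered = registry.publishGuarded(appId, "container_e01_1496758895643_0005_01_000002");
        System.out.println("guarded publish delivered: " + delivered);

        try {
            registry.publishUnguarded(appId, "container_e01_1496758895643_0005_01_000002");
        } catch (NullPointerException e) {
            System.out.println("unguarded publish threw NullPointerException");
        }
    }
}
```

A null check like this only avoids the exception; the late container event is still dropped. Whether dropping is acceptable, or whether collector removal should instead be deferred until pending events are drained, is a design decision this sketch does not settle.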



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

