hadoop-yarn-issues mailing list archives

From "Naganarasimha G R (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4711) NPE in NMTimelinePublisher
Date Thu, 25 Feb 2016 20:09:18 GMT

    [ https://issues.apache.org/jira/browse/YARN-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15167776#comment-15167776 ]

Naganarasimha G R commented on YARN-4711:
-----------------------------------------

Initial analysis shows that I can reproduce these issues when the containers run for a slightly
longer duration (so the NM reports more container resource usage updates). A possible cause
is that YARN-3367 makes all timeline entity publishing single-threaded, while in the NM
all entities are published as sync events. I tried publishing container resource usage as async
events and was able to avoid these exceptions. Will test more and confirm.
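
To make the sync versus async distinction concrete, here is a minimal, self-contained sketch. It is not the NMTimelinePublisher code or the eventual patch, and it does not reproduce the NPE. `PublishModeSketch`, `EntitySink`, `publish` and the entity names are hypothetical stand-ins, assumed only for illustration. It records which thread performs each put, so you can see that a sync publish runs on the calling (dispatcher) thread while an async publish is handed off to a worker.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PublishModeSketch {

    /** Hypothetical stand-in for a timeline client; not the real Hadoop API. */
    interface EntitySink {
        void put(String entity);
    }

    /**
     * Publishes one lifecycle entity synchronously and one resource-usage
     * entity either synchronously or on a single worker thread, and returns
     * the name of the thread that performed each put.
     */
    static Map<String, String> publish(boolean asyncUsage) throws InterruptedException {
        Map<String, String> putThread = new ConcurrentHashMap<>();
        EntitySink sink = entity -> putThread.put(entity, Thread.currentThread().getName());

        ExecutorService worker =
            Executors.newSingleThreadExecutor(r -> new Thread(r, "usage-publisher"));
        try {
            // Lifecycle events stay synchronous: the caller blocks until the put returns.
            sink.put("container-finished");

            if (asyncUsage) {
                // Frequent usage samples are queued so the caller is not held up.
                worker.execute(() -> sink.put("container-usage"));
            } else {
                sink.put("container-usage");
            }
        } finally {
            // Drain the worker so the returned map is complete.
            worker.shutdown();
            worker.awaitTermination(5, TimeUnit.SECONDS);
        }
        return putThread;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("sync : " + publish(false));
        System.out.println("async: " + publish(true));
    }
}
```

In the async case only the high-volume usage sample leaves the calling thread. The lifecycle event is still published inline, which mirrors the change tried above under the assumptions stated.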

> NPE in NMTimelinePublisher
> --------------------------
>
>                 Key: YARN-4711
>                 URL: https://issues.apache.org/jira/browse/YARN-4711
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>            Priority: Critical
>
> While testing the latest 2928 branch, I came across an NPE that shuts down the NM:
> {code}
> 2016-02-21 23:19:54,078 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ContainerEventHandler.handle(NMTimelinePublisher.java:306)
>         at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ContainerEventHandler.handle(NMTimelinePublisher.java:296)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> Seems to be a race condition ...
> A few other NPEs encountered:
> {code}
> java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.putEntity(NMTimelinePublisher.java:213)
>         at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.publishContainerFinishedEvent(NMTimelinePublisher.java:192)
>         at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.access$400(NMTimelinePublisher.java:63)
>         at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ApplicationEventHandler.handle(NMTimelinePublisher.java:289)
>         at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ApplicationEventHandler.handle(NMTimelinePublisher.java:280)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> * There is also a possibility of an NPE in TimelineEntity.toString() when real is not null



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
