hadoop-yarn-issues mailing list archives

From "Hudson (Jira)" <j...@apache.org>
Subject [jira] [Commented] (YARN-10240) Prevent Fatal CancelledException in TimelineV2Client when stopping
Date Tue, 21 Apr 2020 06:49:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-10240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17088351#comment-17088351 ]

Hudson commented on YARN-10240:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18170 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/18170/])
YARN-10240. Prevent Fatal CancelledException in TimelineV2Client when (pjoseph: rev 60fa15366e8f2d59f4dc8e7beaa6edcbbcb9c18f)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineV2ClientImpl.java


> Prevent Fatal CancelledException in TimelineV2Client when stopping
> ------------------------------------------------------------------
>
>                 Key: YARN-10240
>                 URL: https://issues.apache.org/jira/browse/YARN-10240
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: ATSv2
>            Reporter: Tarun Parimi
>            Assignee: Tarun Parimi
>            Priority: Major
>             Fix For: 3.4.0
>
>         Attachments: YARN-10240.001.patch
>
>
> When the timeline client is stopped, it waits for a drain timeout and then cancels all sync EntitiesHolder instances still in the queue.
> {code:java}
> // if some entities were not drained then we need interrupt
> // the threads which had put sync EntityHolders to the queue.
> EntitiesHolder nextEntityInTheQueue = null;
> while ((nextEntityInTheQueue =
>     timelineEntityQueue.poll()) != null) {
>   nextEntityInTheQueue.cancel(true);
> }
> {code}
> Only ExecutionException and InterruptedException are handled here.
> {code:java}
> if (sync) {
>   // In sync call we need to wait till its published and if any error then
>   // throw it back
>   try {
>     entitiesHolder.get();
>   } catch (ExecutionException e) {
>     throw new YarnException("Failed while publishing entity",
>         e.getCause());
>   } catch (InterruptedException e) {
>     Thread.currentThread().interrupt();
>     throw new YarnException("Interrupted while publishing entity", e);
>   }
> }
> {code}
> But calling nextEntityInTheQueue.cancel(true) causes entitiesHolder.get() to throw a java.util.concurrent.CancellationException, which is not handled. The unhandled exception can surface as a FATAL error in the NodeManager (NM) dispatcher thread. We need to prevent this.
> {code:java}
> FATAL event.AsyncDispatcher (AsyncDispatcher.java:dispatch(203)) - Error in dispatcher thread
> java.util.concurrent.CancellationException
> 	at java.util.concurrent.FutureTask.report(FutureTask.java:121)
> 	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> 	at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl$TimelineEntityDispatcher.dispatchEntities(TimelineV2ClientImpl.java:545)
> 	at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl.putEntities(TimelineV2ClientImpl.java:149)
> 	at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.putEntity(NMTimelinePublisher.java:348)
> {code}
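
The failure mode described in the issue can be reproduced outside Hadoop with a plain FutureTask. The sketch below is only an illustration under stated assumptions: the class CancelledHolderSketch, the methods cancelRemaining and waitForPublish, and PublishException (a stand-in for YarnException) are hypothetical names, and a bare FutureTask<Void> stands in for EntitiesHolder. It shows one possible way to handle the cancellation on the waiting side; it is not necessarily what the committed patch does.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;

public class CancelledHolderSketch {

  /** Hypothetical stand-in for YarnException (a checked exception). */
  static class PublishException extends Exception {
    PublishException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  /** Mimics the stop path: cancel every holder still left in the queue. */
  static void cancelRemaining(BlockingQueue<FutureTask<Void>> queue) {
    FutureTask<Void> next;
    while ((next = queue.poll()) != null) {
      next.cancel(true);
    }
  }

  /** Mimics the sync publish path, with CancellationException handled too. */
  static void waitForPublish(FutureTask<Void> holder) throws PublishException {
    try {
      holder.get();
    } catch (ExecutionException e) {
      throw new PublishException("Failed while publishing entity", e.getCause());
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new PublishException("Interrupted while publishing entity", e);
    } catch (CancellationException e) {
      // Unchecked: without this clause it would escape to the caller's thread.
      throw new PublishException("Publishing cancelled, client is stopping", e);
    }
  }

  public static void main(String[] args) {
    BlockingQueue<FutureTask<Void>> queue = new LinkedBlockingQueue<>();
    Callable<Void> noop = () -> null;
    FutureTask<Void> holder = new FutureTask<>(noop);
    queue.add(holder);

    // The holder is never run; it is cancelled while still queued.
    cancelRemaining(queue);

    try {
      waitForPublish(holder);
      System.out.println("published");
    } catch (PublishException e) {
      System.out.println("handled: " + e.getMessage());
    }
  }
}
```

Because CancellationException is unchecked, the compiler does not force callers of get() to handle it, which is how it can go unnoticed until a stop races with a sync publish. Converting it to the checked exception the caller already expects keeps the failure on the normal error path.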



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

