hadoop-yarn-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4695) EntityGroupFSTimelineStore to not log errors during shutdown
Date Tue, 16 Feb 2016 14:16:18 GMT

    [ https://issues.apache.org/jira/browse/YARN-4695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148643#comment-15148643
] 

Steve Loughran commented on YARN-4695:
--------------------------------------

Fuller stack. What's happening is that the RM has been shut down (to be precise: there isn't an
RM running, not even a little one). The log parser is asking for the latest list of running
apps and timing out; this puts the IPC client into retry, which blocks the shutdown: the
executor does not terminate within the wait, and the worker then logs the interrupt at ERROR.
{code}
2016-02-16 14:13:28,126 [IPC Server Responder] INFO  ipc.Server (Server.java:run(959)) - Stopping
IPC Server Responder
2016-02-16 14:13:28,126 [ScalaTest-main-running-TimelineListenerSuite] INFO  timeline.EntityGroupFSTimelineStore
(EntityGroupFSTimelineStore.java:serviceStop(275)) - Stopping EntityGroupFSTimelineStore
2016-02-16 14:13:28,127 [ScalaTest-main-running-TimelineListenerSuite] INFO  timeline.EntityGroupFSTimelineStore
(EntityGroupFSTimelineStore.java:serviceStop(279)) - Waiting for executor to terminate
2016-02-16 14:13:28,966 [EntityLogPluginWorker #1] INFO  ipc.Client (Client.java:handleConnectionFailure(897))
- Retrying connect to server: www.bbc.co.uk/0.0.0.0:8032. Already tried 0 time(s); retry policy
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-02-16 14:13:29,970 [EntityLogPluginWorker #1] INFO  ipc.Client (Client.java:handleConnectionFailure(897))
- Retrying connect to server: www.bbc.co.uk/0.0.0.0:8032. Already tried 1 time(s); retry policy
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-02-16 14:13:30,975 [EntityLogPluginWorker #1] INFO  ipc.Client (Client.java:handleConnectionFailure(897))
- Retrying connect to server: www.bbc.co.uk/0.0.0.0:8032. Already tried 2 time(s); retry policy
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-02-16 14:13:31,980 [EntityLogPluginWorker #1] INFO  ipc.Client (Client.java:handleConnectionFailure(897))
- Retrying connect to server: www.bbc.co.uk/0.0.0.0:8032. Already tried 3 time(s); retry policy
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-02-16 14:13:32,986 [EntityLogPluginWorker #1] INFO  ipc.Client (Client.java:handleConnectionFailure(897))
- Retrying connect to server: www.bbc.co.uk/0.0.0.0:8032. Already tried 4 time(s); retry policy
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-02-16 14:13:33,995 [EntityLogPluginWorker #1] INFO  ipc.Client (Client.java:handleConnectionFailure(897))
- Retrying connect to server: www.bbc.co.uk/0.0.0.0:8032. Already tried 5 time(s); retry policy
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-02-16 14:13:35,000 [EntityLogPluginWorker #1] INFO  ipc.Client (Client.java:handleConnectionFailure(897))
- Retrying connect to server: www.bbc.co.uk/0.0.0.0:8032. Already tried 6 time(s); retry policy
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-02-16 14:13:36,005 [EntityLogPluginWorker #1] INFO  ipc.Client (Client.java:handleConnectionFailure(897))
- Retrying connect to server: www.bbc.co.uk/0.0.0.0:8032. Already tried 7 time(s); retry policy
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-02-16 14:13:37,010 [EntityLogPluginWorker #1] INFO  ipc.Client (Client.java:handleConnectionFailure(897))
- Retrying connect to server: www.bbc.co.uk/0.0.0.0:8032. Already tried 8 time(s); retry policy
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-02-16 14:13:38,015 [EntityLogPluginWorker #1] INFO  ipc.Client (Client.java:handleConnectionFailure(897))
- Retrying connect to server: www.bbc.co.uk/0.0.0.0:8032. Already tried 9 time(s); retry policy
is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-02-16 14:13:38,133 [ScalaTest-main-running-TimelineListenerSuite] WARN  timeline.EntityGroupFSTimelineStore
(EntityGroupFSTimelineStore.java:serviceStop(284)) - Executor did not terminate
2016-02-16 14:13:38,133 [ScalaTest-main-running-TimelineListenerSuite] INFO  timeline.LeveldbTimelineStore
(LeveldbTimelineStore.java:serviceStop(250)) - Waiting for deletion thread to complete its
current action
2016-02-16 14:13:38,133 [Thread-9] INFO  timeline.LeveldbTimelineStore (LeveldbTimelineStore.java:run(296))
- Deletion thread received interrupt, exiting
2016-02-16 14:13:38,134 [EntityLogPluginWorker #1] ERROR timeline.EntityGroupFSTimelineStore
(EntityGroupFSTimelineStore.java:run(693)) - Error processing logs for application_1111_0000
java.lang.reflect.UndeclaredThrowableException
	at com.sun.proxy.$Proxy25.getApplicationReport(Unknown Source)
	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:448)
	at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getAppState(EntityGroupFSTimelineStore.java:464)
	at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.access$700(EntityGroupFSTimelineStore.java:79)
	at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore$AppLogs.parseSummaryLogs(EntityGroupFSTimelineStore.java:529)
	at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore$AppLogs.parseSummaryLogs(EntityGroupFSTimelineStore.java:519)
	at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore$ActiveLogParser.run(EntityGroupFSTimelineStore.java:686)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.InterruptedException: sleep interrupted
	at java.lang.Thread.sleep(Native Method)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:159)
	... 14 more
{code}
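
The trace above shows the {{InterruptedException}} arriving wrapped in an {{UndeclaredThrowableException}}, so a plain {{catch (InterruptedException e)}} in the worker would not see it. A minimal sketch of one way to spot the interrupt by walking the cause chain follows. The class and method names ({{InterruptCauseCheck}}, {{causedByInterrupt}}) are hypothetical and are not taken from the Hadoop source; this is an illustration, not the actual fix.

```java
import java.lang.reflect.UndeclaredThrowableException;

// Hypothetical helper, not part of Hadoop: illustrates cause-chain inspection only.
final class InterruptCauseCheck {

    private InterruptCauseCheck() {
    }

    /**
     * Returns true if the throwable, or any throwable in its cause chain,
     * is an InterruptedException. The depth cap guards against cyclic chains.
     */
    static boolean causedByInterrupt(Throwable t) {
        int depth = 0;
        while (t != null && depth < 32) {
            if (t instanceof InterruptedException) {
                return true;
            }
            t = t.getCause();
            depth++;
        }
        return false;
    }

    public static void main(String[] args) {
        // Same shape as the logged failure: an interrupt wrapped by a proxy.
        Throwable wrapped = new UndeclaredThrowableException(
            new InterruptedException("sleep interrupted"));
        System.out.println(causedByInterrupt(wrapped));
        System.out.println(causedByInterrupt(new IllegalStateException("other")));
    }
}
```

A worker could use a check like this to decide whether a failure is a side effect of being stopped or a genuine processing error.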

> EntityGroupFSTimelineStore to not log errors during shutdown
> ------------------------------------------------------------
>
>                 Key: YARN-4695
>                 URL: https://issues.apache.org/jira/browse/YARN-4695
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>
> # The {{EntityGroupFSTimelineStore}} threads log exceptions that get raised during their execution.
> # The service stops by interrupting all its workers.
> # As a result, the workers all log exceptions at error level *even during a managed shutdown*.
> # This creates distracting noise in logs.
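
The behaviour asked for in the list above could be sketched roughly as follows: the service sets a flag before interrupting its workers, and a worker that fails while the flag is set logs at a low level instead of ERROR. Everything here is an assumption made for illustration: the class {{QuietShutdownWorker}}, the {{markStopping()}} method, and the in-memory list standing in for a logger are hypothetical and do not reflect how {{EntityGroupFSTimelineStore}} is implemented or how the issue was resolved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch, not Hadoop code: quiet logging while the owner is stopping.
final class QuietShutdownWorker implements Runnable {

    // Set by the owning service before it interrupts its workers.
    private final AtomicBoolean stopping = new AtomicBoolean(false);
    // Stand-in for a real logger so the sketch stays self-contained.
    private final List<String> log = Collections.synchronizedList(new ArrayList<String>());
    private final Runnable task;

    QuietShutdownWorker(Runnable task) {
        this.task = task;
    }

    void markStopping() {
        stopping.set(true);
    }

    List<String> logLines() {
        return log;
    }

    @Override
    public void run() {
        try {
            task.run();
        } catch (RuntimeException e) {
            if (stopping.get()) {
                // Expected during a managed shutdown: keep it out of the error log.
                log.add("DEBUG worker stopped during shutdown: " + e);
            } else {
                log.add("ERROR error processing logs: " + e);
            }
        }
    }

    public static void main(String[] args) {
        QuietShutdownWorker worker = new QuietShutdownWorker(() -> {
            throw new IllegalStateException("simulated failure");
        });
        worker.markStopping();
        worker.run();
        System.out.println(worker.logLines());
    }
}
```

The flag is checked only after a failure, so errors raised during normal operation are still reported at ERROR.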



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
