reef-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sergiy Matusevych (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (REEF-1753) Intermittent failures of CloseEvaluatorTest in local runtime
Date Wed, 22 Mar 2017 00:13:41 GMT

    [ https://issues.apache.org/jira/browse/REEF-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935567#comment-15935567
] 

Sergiy Matusevych edited comment on REEF-1753 at 3/22/17 12:13 AM:
-------------------------------------------------------------------

I am able to reproduce the error with a lower log level. Here's the sale log fragment as before:
{code:none|title=driver.stderr}
2017-03-21 16:51:55,368 FINER reef.runtime.common.driver.DriverStatusManager.onComplete main
| ENTRY
2017-03-21 16:51:55,368 INFO reef.runtime.common.driver.DriverStatusManager.onComplete main
| Clean shutdown of the Driver.
2017-03-21 16:51:55,368 FINEST reef.runtime.common.driver.DriverStatusManager.onComplete main
| Call stack: 
java.lang.Exception: Dummy exception to get the call stack
	at org.apache.reef.runtime.common.driver.DriverStatusManager.onComplete(DriverStatusManager.java:144)
	at org.apache.reef.runtime.common.driver.idle.DriverIdleManager.onPotentiallyIdle(DriverIdleManager.java:84)
	at org.apache.reef.runtime.common.driver.resourcemanager.ResourceManagerStatus.onRMRunning(ResourceManagerStatus.java:155)
	at org.apache.reef.runtime.common.driver.resourcemanager.ResourceManagerStatus.onNext(ResourceManagerStatus.java:95)
	at org.apache.reef.runtime.common.driver.resourcemanager.ResourceManagerStatus.onNext(ResourceManagerStatus.java:38)
	at org.apache.reef.runtime.local.driver.ResourceManager.sendRuntimeStatus(ResourceManager.java:256)
	at org.apache.reef.runtime.local.driver.ResourceManager.checkRequestQueue(ResourceManager.java:238)
	at org.apache.reef.runtime.local.driver.ResourceManager.onResourceReleaseRequest(ResourceManager.java:136)
	at org.apache.reef.runtime.local.driver.LocalResourceReleaseHandler.onNext(LocalResourceReleaseHandler.java:45)
	at org.apache.reef.runtime.local.driver.LocalResourceReleaseHandler.onNext(LocalResourceReleaseHandler.java:32)
	at org.apache.reef.runtime.common.driver.evaluator.EvaluatorManager$1.onNext(EvaluatorManager.java:269)
	at org.apache.reef.runtime.common.driver.evaluator.EvaluatorManager$1.onNext(EvaluatorManager.java:266)
	at org.apache.reef.wake.time.event.Alarm.run(Alarm.java:42)
	at org.apache.reef.wake.time.runtime.RuntimeClock.run(RuntimeClock.java:346)
	at org.apache.reef.runtime.common.REEFEnvironment.run(REEFEnvironment.java:166)
	at org.apache.reef.runtime.common.REEFLauncher.main(REEFLauncher.java:181)

2017-03-21 16:51:55,369 FINER reef.wake.time.runtime.RuntimeClock.close main | ENTRY
2017-03-21 16:51:55,370 FINE reef.wake.time.runtime.RuntimeClock.close main | Graceful shutdown
scheduled: org.apache.reef.wake.time.event.StopTime:[1490140315370|Tue Mar 21 16:51:55 PDT
2017] Outstanding client alarms: 0
2017-03-21 16:51:55,370 FINER reef.wake.time.runtime.RuntimeClock.close main | RETURN
2017-03-21 16:51:55,370 FINER reef.runtime.common.driver.DriverStatusManager.onComplete main
| RETURN
2017-03-21 16:51:55,371 FINEST reef.wake.time.runtime.RuntimeClock.run main | Enter clock
main loop.
2017-03-21 16:51:55,371 FINEST reef.wake.impl.PubSubEventHandler.onNext main | Invoked for
event: org.apache.reef.wake.time.runtime.event.IdleClock:[1490140315371|Tue Mar 21 16:51:55
PDT 2017]
2017-03-21 16:51:55,373 FINEST reef.wake.impl.PubSubEventHandler.onNext main | Invoking org.apache.reef.runtime.common.driver.idle.ClockIdlenessSource@2e377400
2017-03-21 16:51:55,373 FINEST reef.runtime.common.driver.idle.DriverIdleManager.onPotentiallyIdle
main | Ignoring idle call from [Clock] for reason [The clock reported idle.]
2017-03-21 16:51:55,373 FINER reef.wake.time.runtime.RuntimeClock.run main | Process event:
org.apache.reef.wake.time.event.StopTime:[1490140315370|Tue Mar 21 16:51:55 PDT 2017] Outstanding
client alarms: 0
2017-03-21 16:51:55,374 FINEST reef.wake.impl.PubSubEventHandler.onNext main | Invoked for
event: org.apache.reef.wake.time.event.StopTime:[1490140315370|Tue Mar 21 16:51:55 PDT 2017]
2017-03-21 16:51:55,374 FINEST reef.wake.impl.PubSubEventHandler.onNext main | Invoking org.apache.reef.wake.impl.LoggingEventHandler@10aa41f2
2017-03-21 16:51:55,374 FINE reef.wake.impl.LoggingEventHandler.onNext main | org.apache.reef.wake.time.event.StopTime:[1490140315370|Tue
Mar 21 16:51:55 PDT 2017]
2017-03-21 16:51:55,374 FINEST reef.wake.impl.PubSubEventHandler.onNext main | Invoked for
event: org.apache.reef.wake.time.runtime.event.RuntimeStop:[1490140315374|Tue Mar 21 16:51:55
PDT 2017]
2017-03-21 16:51:55,374 FINEST reef.wake.impl.PubSubEventHandler.onNext main | Invoking org.apache.reef.runtime.common.driver.DriverRuntimeStopHandler@22fcf7ab
2017-03-21 16:51:55,374 FINE reef.runtime.common.driver.DriverRuntimeStopHandler.onNext main
| Driver shutdown: start org.apache.reef.wake.time.runtime.event.RuntimeStop:[1490140315374|Tue
Mar 21 16:51:55 PDT 2017]
2017-03-21 16:51:55,375 FINER reef.runtime.common.driver.DriverRuntimeStopHandler.onNext main
| Driver shutdown: close the evaluators
2017-03-21 16:51:55,375 FINER reef.runtime.common.driver.evaluator.Evaluators.close main |
Closing the evaluators - begin
2017-03-21 16:51:55,375 FINER reef.runtime.common.driver.evaluator.Evaluators.close main |
Closing the evaluators - end
2017-03-21 16:51:55,375 FINEST reef.runtime.local.driver.ContainerManager.close main | Clean
shutdown with no outstanding containers.
2017-03-21 16:51:55,375 FINER reef.runtime.common.driver.DriverRuntimeStopHandler.onNext main
| Driver shutdown: notify the client
2017-03-21 16:51:55,376 FINEST reef.runtime.common.driver.client.ClientConnection.send main
| Sending:
identifier: "TEST_CloseEvaluatorTest"
state: DONE

2017-03-21 16:51:55,376 FINE reef.wake.remote.impl.ProxyEventHandler.onNext main | remoteid:
/10.130.68.120:16905
identifier: "TEST_CloseEvaluatorTest"
state: DONE

2017-03-21 16:51:55,377 FINER reef.runtime.common.driver.DriverRuntimeStopHandler.onNext main
| Driver shutdown: close the remote manager
2017-03-21 16:51:55,377 FINER reef.runtime.common.utils.RemoteManager.close main | ENTRY
2017-03-21 16:51:55,378 FINE reef.wake.remote.impl.DefaultRemoteManagerImplementation.close
main | RemoteManager: REEF_LAUNCHER Closing remote manager id: socket://10.130.68.120:17209
2017-03-21 16:51:55,381 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003 | Link null for /10.130.68.120:16905
found
2017-03-21 16:51:55,381 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003 | No cached link for
/10.130.68.120:16905 thread Thread[org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003,5,main]
2017-03-21 16:51:55,384 FINE reef.wake.remote.impl.DefaultRemoteManagerImplementation$1.run
CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209 | Closing sender stage socket://10.130.68.120:17209
2017-03-21 16:51:55,384 FINE reef.wake.remote.impl.RemoteSenderStage.close CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209
| close org.apache.reef.wake.remote.transport.netty.NettyMessagingTransport@43ba6512
2017-03-21 16:51:55,448 FINEST reef.runtime.common.driver.idle.DriverIdleManager.onPotentiallyIdle
EvaluatorIdlenessThreadPool:pool-07:thread-004 | Ignoring idle call from [EventHandlers] for
reason [All events have been processed.]
2017-03-21 16:51:55,448 FINEST reef.runtime.common.driver.evaluator.EvaluatorIdlenessThreadPool$1.run
EvaluatorIdlenessThreadPool:pool-07:thread-004 | Idle check for Evaluator Node-2-1490140314783
- end
2017-03-21 16:51:55,500 FINEST reef.runtime.common.driver.idle.DriverIdleManager.onPotentiallyIdle
EvaluatorIdlenessThreadPool:pool-07:thread-003 | Ignoring idle call from [EventHandlers] for
reason [All events have been processed.]
2017-03-21 16:51:55,500 FINEST reef.runtime.common.driver.evaluator.EvaluatorIdlenessThreadPool$1.run
EvaluatorIdlenessThreadPool:pool-07:thread-003 | Idle check for Evaluator Node-4-1490140314902
- end
2017-03-21 16:51:55,528 FINEST reef.runtime.common.driver.idle.DriverIdleManager.onPotentiallyIdle
EvaluatorIdlenessThreadPool:pool-07:thread-005 | Ignoring idle call from [EventHandlers] for
reason [All events have been processed.]
2017-03-21 16:51:55,529 FINEST reef.runtime.common.driver.evaluator.EvaluatorIdlenessThreadPool$1.run
EvaluatorIdlenessThreadPool:pool-07:thread-005 | Idle check for Evaluator Node-3-1490140314976
- end
2017-03-21 16:51:55,719 FINEST reef.runtime.common.driver.idle.DriverIdleManager.onPotentiallyIdle
EvaluatorIdlenessThreadPool:pool-07:thread-001 | Ignoring idle call from [EventHandlers] for
reason [All events have been processed.]
2017-03-21 16:51:55,719 FINEST reef.runtime.common.driver.evaluator.EvaluatorIdlenessThreadPool$1.run
EvaluatorIdlenessThreadPool:pool-07:thread-001 | Idle check for Evaluator Node-4-1490140315159
- end
2017-03-21 16:51:56,376 WARNING reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-002 | Connection refused.
Retry 1 of 3
2017-03-21 16:51:56,493 WARNING reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003 | Connection refused.
Retry 1 of 3
2017-03-21 16:52:05,297 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-001 | Link null for /10.130.68.120:16905
found
2017-03-21 16:52:05,297 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-001 | No cached link for
/10.130.68.120:16905 thread Thread[org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-001,5,main]
2017-03-21 16:52:05,386 WARNING reef.wake.remote.impl.RemoteSenderStage.close CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209
| Executor did not terminate in 10,000 ms.
2017-03-21 16:52:05,393 WARNING reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-002 | Thread Thread[org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-002,5,main]
interrupted while sleeping
2017-03-21 16:52:05,393 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-002 | Link null for /10.130.68.120:16905
found
2017-03-21 16:52:05,394 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-002 | No cached link for
/10.130.68.120:16905 thread Thread[org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-002,5,main]
2017-03-21 16:52:05,394 WARNING reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003 | Thread Thread[org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003,5,main]
interrupted while sleeping
2017-03-21 16:52:05,394 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003 | Link null for /10.130.68.120:16905
found
2017-03-21 16:52:05,394 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003 | No cached link for
/10.130.68.120:16905 thread Thread[org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003,5,main]
2017-03-21 16:52:05,394 WARNING reef.wake.remote.impl.RemoteSenderStage.close CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209
| Executor dropped 0 tasks.
2017-03-21 16:52:05,395 FINE reef.wake.remote.impl.DefaultRemoteManagerImplementation$1.run
CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209 | Closed the remote sender
stage
2017-03-21 16:52:05,395 FINE reef.wake.remote.impl.DefaultRemoteManagerImplementation$1.run
CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209 | Closing transport socket://10.130.68.120:17209
2017-03-21 16:52:05,395 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.close
CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209 | Closing netty transport socket
address: /10.130.68.120:17209
2017-03-21 16:52:07,704 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.close
CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209 | Closing netty transport socket
address: /10.130.68.120:17209 done
2017-03-21 16:52:07,704 FINE reef.wake.remote.impl.DefaultRemoteManagerImplementation$1.run
CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209 | Closed the transport
2017-03-21 16:52:07,705 FINE reef.wake.remote.impl.DefaultRemoteManagerImplementation$1.run
CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209 | Closing receiver stage socket://10.130.68.120:17209
2017-03-21 16:52:07,705 FINE reef.wake.remote.impl.OrderedRemoteReceiverStage.close CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209
| Close PushExecutor begin
2017-03-21 16:52:07,705 FINE reef.wake.remote.impl.OrderedRemoteReceiverStage.close CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209
| Close PushExecutor end
2017-03-21 16:52:07,706 FINE reef.wake.remote.impl.OrderedRemoteReceiverStage.close CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209
| Close PullExecutor begin
2017-03-21 16:52:07,706 FINE reef.wake.remote.impl.OrderedRemoteReceiverStage.close CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209
| Close PullExecutor end
2017-03-21 16:52:07,706 FINE reef.wake.remote.impl.DefaultRemoteManagerImplementation$1.run
CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209 | Closed the remote receiver
stage
2017-03-21 16:52:07,706 FINE reef.wake.remote.impl.DefaultRemoteManagerImplementation.close
main | Close executor terminated properly.
2017-03-21 16:52:07,706 FINER reef.runtime.common.utils.RemoteManager.close main | RETURN
2017-03-21 16:52:07,706 FINER reef.runtime.common.driver.DriverRuntimeStopHandler.onNext main
| Driver shutdown: close the restart manager
2017-03-21 16:52:07,706 FINER reef.driver.restart.DriverRestartManager.close main | Closing
restart timer. Final state: NOT_RESTARTED
2017-03-21 16:52:07,707 FINER reef.runtime.common.driver.DriverRuntimeStopHandler.onNext main
| Driver shutdown: close the idleness checker
2017-03-21 16:52:07,707 FINE reef.runtime.common.driver.evaluator.EvaluatorIdlenessThreadPool.close
main | EvaluatorIdlenessThreadPool shutdown: begin
2017-03-21 16:52:07,707 FINE reef.runtime.common.driver.evaluator.EvaluatorIdlenessThreadPool.close
main | EvaluatorIdlenessThreadPool shutdown: Terminated successfully
2017-03-21 16:52:07,707 INFO reef.runtime.common.driver.DriverRuntimeStopHandler.onNext main
| Driver shutdown complete
{code}
[~taegeonum] could you please take a look at this one? Thank you!


was (Author: motus):
I am able to reproduce the error with a lower log level. Here's the sale log fragment as before:
{code:none|title=driver.stderr}
2017-03-21 16:51:55,368 FINER reef.runtime.common.driver.DriverStatusManager.onComplete main
| ENTRY
2017-03-21 16:51:55,368 INFO reef.runtime.common.driver.DriverStatusManager.onComplete main
| Clean shutdown of the Driver.
2017-03-21 16:51:55,368 FINEST reef.runtime.common.driver.DriverStatusManager.onComplete main
| Call stack: 
java.lang.Exception: Dummy exception to get the call stack
	at org.apache.reef.runtime.common.driver.DriverStatusManager.onComplete(DriverStatusManager.java:144)
	at org.apache.reef.runtime.common.driver.idle.DriverIdleManager.onPotentiallyIdle(DriverIdleManager.java:84)
	at org.apache.reef.runtime.common.driver.resourcemanager.ResourceManagerStatus.onRMRunning(ResourceManagerStatus.java:155)
	at org.apache.reef.runtime.common.driver.resourcemanager.ResourceManagerStatus.onNext(ResourceManagerStatus.java:95)
	at org.apache.reef.runtime.common.driver.resourcemanager.ResourceManagerStatus.onNext(ResourceManagerStatus.java:38)
	at org.apache.reef.runtime.local.driver.ResourceManager.sendRuntimeStatus(ResourceManager.java:256)
	at org.apache.reef.runtime.local.driver.ResourceManager.checkRequestQueue(ResourceManager.java:238)
	at org.apache.reef.runtime.local.driver.ResourceManager.onResourceReleaseRequest(ResourceManager.java:136)
	at org.apache.reef.runtime.local.driver.LocalResourceReleaseHandler.onNext(LocalResourceReleaseHandler.java:45)
	at org.apache.reef.runtime.local.driver.LocalResourceReleaseHandler.onNext(LocalResourceReleaseHandler.java:32)
	at org.apache.reef.runtime.common.driver.evaluator.EvaluatorManager$1.onNext(EvaluatorManager.java:269)
	at org.apache.reef.runtime.common.driver.evaluator.EvaluatorManager$1.onNext(EvaluatorManager.java:266)
	at org.apache.reef.wake.time.event.Alarm.run(Alarm.java:42)
	at org.apache.reef.wake.time.runtime.RuntimeClock.run(RuntimeClock.java:346)
	at org.apache.reef.runtime.common.REEFEnvironment.run(REEFEnvironment.java:166)
	at org.apache.reef.runtime.common.REEFLauncher.main(REEFLauncher.java:181)

2017-03-21 16:51:55,369 FINER reef.wake.time.runtime.RuntimeClock.close main | ENTRY
2017-03-21 16:51:55,370 FINE reef.wake.time.runtime.RuntimeClock.close main | Graceful shutdown
scheduled: org.apache.reef.wake.time.event.StopTime:[1490140315370|Tue Mar 21 16:51:55 PDT
2017] Outstanding client alarms: 0
2017-03-21 16:51:55,370 FINER reef.wake.time.runtime.RuntimeClock.close main | RETURN
2017-03-21 16:51:55,370 FINER reef.runtime.common.driver.DriverStatusManager.onComplete main
| RETURN
2017-03-21 16:51:55,371 FINEST reef.wake.time.runtime.RuntimeClock.run main | Enter clock
main loop.
2017-03-21 16:51:55,371 FINEST reef.wake.impl.PubSubEventHandler.onNext main | Invoked for
event: org.apache.reef.wake.time.runtime.event.IdleClock:[1490140315371|Tue Mar 21 16:51:55
PDT 2017]
2017-03-21 16:51:55,373 FINEST reef.wake.impl.PubSubEventHandler.onNext main | Invoking org.apache.reef.runtime.common.driver.idle.ClockIdlenessSource@2e377400
2017-03-21 16:51:55,373 FINEST reef.runtime.common.driver.idle.DriverIdleManager.onPotentiallyIdle
main | Ignoring idle call from [Clock] for reason [The clock reported idle.]
2017-03-21 16:51:55,373 FINER reef.wake.time.runtime.RuntimeClock.run main | Process event:
org.apache.reef.wake.time.event.StopTime:[1490140315370|Tue Mar 21 16:51:55 PDT 2017] Outstanding
client alarms: 0
2017-03-21 16:51:55,374 FINEST reef.wake.impl.PubSubEventHandler.onNext main | Invoked for
event: org.apache.reef.wake.time.event.StopTime:[1490140315370|Tue Mar 21 16:51:55 PDT 2017]
2017-03-21 16:51:55,374 FINEST reef.wake.impl.PubSubEventHandler.onNext main | Invoking org.apache.reef.wake.impl.LoggingEventHandler@10aa41f2
2017-03-21 16:51:55,374 FINE reef.wake.impl.LoggingEventHandler.onNext main | org.apache.reef.wake.time.event.StopTime:[1490140315370|Tue
Mar 21 16:51:55 PDT 2017]
2017-03-21 16:51:55,374 FINEST reef.wake.impl.PubSubEventHandler.onNext main | Invoked for
event: org.apache.reef.wake.time.runtime.event.RuntimeStop:[1490140315374|Tue Mar 21 16:51:55
PDT 2017]
2017-03-21 16:51:55,374 FINEST reef.wake.impl.PubSubEventHandler.onNext main | Invoking org.apache.reef.runtime.common.driver.DriverRuntimeStopHandler@22fcf7ab
2017-03-21 16:51:55,374 FINE reef.runtime.common.driver.DriverRuntimeStopHandler.onNext main
| Driver shutdown: start org.apache.reef.wake.time.runtime.event.RuntimeStop:[1490140315374|Tue
Mar 21 16:51:55 PDT 2017]
2017-03-21 16:51:55,375 FINER reef.runtime.common.driver.DriverRuntimeStopHandler.onNext main
| Driver shutdown: close the evaluators
2017-03-21 16:51:55,375 FINER reef.runtime.common.driver.evaluator.Evaluators.close main |
Closing the evaluators - begin
2017-03-21 16:51:55,375 FINER reef.runtime.common.driver.evaluator.Evaluators.close main |
Closing the evaluators - end
2017-03-21 16:51:55,375 FINEST reef.runtime.local.driver.ContainerManager.close main | Clean
shutdown with no outstanding containers.
2017-03-21 16:51:55,375 FINER reef.runtime.common.driver.DriverRuntimeStopHandler.onNext main
| Driver shutdown: notify the client
2017-03-21 16:51:55,376 FINEST reef.runtime.common.driver.client.ClientConnection.send main
| Sending:
identifier: "TEST_CloseEvaluatorTest"
state: DONE

2017-03-21 16:51:55,376 FINE reef.wake.remote.impl.ProxyEventHandler.onNext main | remoteid:
/10.130.68.120:16905
identifier: "TEST_CloseEvaluatorTest"
state: DONE

2017-03-21 16:51:55,377 FINER reef.runtime.common.driver.DriverRuntimeStopHandler.onNext main
| Driver shutdown: close the remote manager
2017-03-21 16:51:55,377 FINER reef.runtime.common.utils.RemoteManager.close main | ENTRY
2017-03-21 16:51:55,378 FINE reef.wake.remote.impl.DefaultRemoteManagerImplementation.close
main | RemoteManager: REEF_LAUNCHER Closing remote manager id: socket://10.130.68.120:17209
2017-03-21 16:51:55,381 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003 | Link null for /10.130.68.120:16905
found
2017-03-21 16:51:55,381 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003 | No cached link for
/10.130.68.120:16905 thread Thread[org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003,5,main]
2017-03-21 16:51:55,384 FINE reef.wake.remote.impl.DefaultRemoteManagerImplementation$1.run
CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209 | Closing sender stage socket://10.130.68.120:17209
2017-03-21 16:51:55,384 FINE reef.wake.remote.impl.RemoteSenderStage.close CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209
| close org.apache.reef.wake.remote.transport.netty.NettyMessagingTransport@43ba6512
2017-03-21 16:51:55,448 FINEST reef.runtime.common.driver.idle.DriverIdleManager.onPotentiallyIdle
EvaluatorIdlenessThreadPool:pool-07:thread-004 | Ignoring idle call from [EventHandlers] for
reason [All events have been processed.]
2017-03-21 16:51:55,448 FINEST reef.runtime.common.driver.evaluator.EvaluatorIdlenessThreadPool$1.run
EvaluatorIdlenessThreadPool:pool-07:thread-004 | Idle check for Evaluator Node-2-1490140314783
- end
2017-03-21 16:51:55,500 FINEST reef.runtime.common.driver.idle.DriverIdleManager.onPotentiallyIdle
EvaluatorIdlenessThreadPool:pool-07:thread-003 | Ignoring idle call from [EventHandlers] for
reason [All events have been processed.]
2017-03-21 16:51:55,500 FINEST reef.runtime.common.driver.evaluator.EvaluatorIdlenessThreadPool$1.run
EvaluatorIdlenessThreadPool:pool-07:thread-003 | Idle check for Evaluator Node-4-1490140314902
- end
2017-03-21 16:51:55,528 FINEST reef.runtime.common.driver.idle.DriverIdleManager.onPotentiallyIdle
EvaluatorIdlenessThreadPool:pool-07:thread-005 | Ignoring idle call from [EventHandlers] for
reason [All events have been processed.]
2017-03-21 16:51:55,529 FINEST reef.runtime.common.driver.evaluator.EvaluatorIdlenessThreadPool$1.run
EvaluatorIdlenessThreadPool:pool-07:thread-005 | Idle check for Evaluator Node-3-1490140314976
- end
2017-03-21 16:51:55,719 FINEST reef.runtime.common.driver.idle.DriverIdleManager.onPotentiallyIdle
EvaluatorIdlenessThreadPool:pool-07:thread-001 | Ignoring idle call from [EventHandlers] for
reason [All events have been processed.]
2017-03-21 16:51:55,719 FINEST reef.runtime.common.driver.evaluator.EvaluatorIdlenessThreadPool$1.run
EvaluatorIdlenessThreadPool:pool-07:thread-001 | Idle check for Evaluator Node-4-1490140315159
- end
2017-03-21 16:51:56,376 WARNING reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-002 | Connection refused.
Retry 1 of 3
2017-03-21 16:51:56,493 WARNING reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003 | Connection refused.
Retry 1 of 3
2017-03-21 16:52:05,297 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-001 | Link null for /10.130.68.120:16905
found
2017-03-21 16:52:05,297 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-001 | No cached link for
/10.130.68.120:16905 thread Thread[org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-001,5,main]
2017-03-21 16:52:05,386 WARNING reef.wake.remote.impl.RemoteSenderStage.close CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209
| Executor did not terminate in 10,000 ms.
2017-03-21 16:52:05,393 WARNING reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-002 | Thread Thread[org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-002,5,main]
interrupted while sleeping
2017-03-21 16:52:05,393 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-002 | Link null for /10.130.68.120:16905
found
2017-03-21 16:52:05,394 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-002 | No cached link for
/10.130.68.120:16905 thread Thread[org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-002,5,main]
2017-03-21 16:52:05,394 WARNING reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003 | Thread Thread[org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003,5,main]
interrupted while sleeping
2017-03-21 16:52:05,394 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003 | Link null for /10.130.68.120:16905
found
2017-03-21 16:52:05,394 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.open
org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003 | No cached link for
/10.130.68.120:16905 thread Thread[org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003,5,main]
2017-03-21 16:52:05,394 WARNING reef.wake.remote.impl.RemoteSenderStage.close CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209
| Executor dropped 0 tasks.
2017-03-21 16:52:05,395 FINE reef.wake.remote.impl.DefaultRemoteManagerImplementation$1.run
CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209 | Closed the remote sender
stage
2017-03-21 16:52:05,395 FINE reef.wake.remote.impl.DefaultRemoteManagerImplementation$1.run
CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209 | Closing transport socket://10.130.68.120:17209
2017-03-21 16:52:05,395 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.close
CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209 | Closing netty transport socket
address: /10.130.68.120:17209
2017-03-21 16:52:07,704 FINE reef.wake.remote.transport.netty.NettyMessagingTransport.close
CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209 | Closing netty transport socket
address: /10.130.68.120:17209 done
2017-03-21 16:52:07,704 FINE reef.wake.remote.impl.DefaultRemoteManagerImplementation$1.run
CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209 | Closed the transport
2017-03-21 16:52:07,705 FINE reef.wake.remote.impl.DefaultRemoteManagerImplementation$1.run
CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209 | Closing receiver stage socket://10.130.68.120:17209
2017-03-21 16:52:07,705 FINE reef.wake.remote.impl.OrderedRemoteReceiverStage.close CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209
| Close PushExecutor begin
2017-03-21 16:52:07,705 FINE reef.wake.remote.impl.OrderedRemoteReceiverStage.close CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209
| Close PushExecutor end
2017-03-21 16:52:07,706 FINE reef.wake.remote.impl.OrderedRemoteReceiverStage.close CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209
| Close PullExecutor begin
2017-03-21 16:52:07,706 FINE reef.wake.remote.impl.OrderedRemoteReceiverStage.close CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209
| Close PullExecutor end
2017-03-21 16:52:07,706 FINE reef.wake.remote.impl.DefaultRemoteManagerImplementation$1.run
CLOSE:RemoteManager:REEF_LAUNCHER:socket://10.130.68.120:17209 | Closed the remote receiver
stage
2017-03-21 16:52:07,706 FINE reef.wake.remote.impl.DefaultRemoteManagerImplementation.close
main | Close executor terminated properly.
2017-03-21 16:52:07,706 FINER reef.runtime.common.utils.RemoteManager.close main | RETURN
2017-03-21 16:52:07,706 FINER reef.runtime.common.driver.DriverRuntimeStopHandler.onNext main
| Driver shutdown: close the restart manager
2017-03-21 16:52:07,706 FINER reef.driver.restart.DriverRestartManager.close main | Closing
restart timer. Final state: NOT_RESTARTED
2017-03-21 16:52:07,707 FINER reef.runtime.common.driver.DriverRuntimeStopHandler.onNext main
| Driver shutdown: close the idleness checker
2017-03-21 16:52:07,707 FINE reef.runtime.common.driver.evaluator.EvaluatorIdlenessThreadPool.close
main | EvaluatorIdlenessThreadPool shutdown: begin
2017-03-21 16:52:07,707 FINE reef.runtime.common.driver.evaluator.EvaluatorIdlenessThreadPool.close
main | EvaluatorIdlenessThreadPool shutdown: Terminated successfully
2017-03-21 16:52:07,707 INFO reef.runtime.common.driver.DriverRuntimeStopHandler.onNext main
| Driver shutdown complete
{code}


> Intermittent failures of CloseEvaluatorTest in local runtime
> ------------------------------------------------------------
>
>                 Key: REEF-1753
>                 URL: https://issues.apache.org/jira/browse/REEF-1753
>             Project: REEF
>          Issue Type: Bug
>          Components: REEF Driver, Wake
>            Reporter: Sergiy Matusevych
>              Labels: shutdown
>
> {{mvn test}} reports:
> {code:none}
> Failed tests:
>   CloseEvaluatorTest.testCloseEvaluator:61 Job state after execution: FORCE_CLOSED
> {code}
> and in the driver logs I see the following:
> {code:none|title=faulty driver.stderr}
> Mar 21, 2017 2:06:51 PM org.apache.reef.runtime.common.driver.DriverStatusManager onComplete
> INFO: Clean shutdown of the Driver.
> Mar 21, 2017 2:06:52 PM org.apache.reef.wake.remote.transport.netty.NettyMessagingTransport
open
> WARNING: Connection refused. Retry 1 of 3
> Mar 21, 2017 2:06:52 PM org.apache.reef.wake.remote.transport.netty.NettyMessagingTransport
open
> WARNING: Connection refused. Retry 1 of 3
> Mar 21, 2017 2:07:01 PM org.apache.reef.wake.remote.impl.RemoteSenderStage close
> WARNING: Executor did not terminate in 10,000 ms.
> Mar 21, 2017 2:07:01 PM org.apache.reef.wake.remote.transport.netty.NettyMessagingTransport
open
> WARNING: Thread Thread[org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-003,5,main]
interrupted while sleeping
> Mar 21, 2017 2:07:01 PM org.apache.reef.wake.remote.transport.netty.NettyMessagingTransport
open
> WARNING: Thread Thread[org.apache.reef.wake.remote.impl.RemoteSenderStage:pool-06:thread-001,5,main]
interrupted while sleeping
> Mar 21, 2017 2:07:01 PM org.apache.reef.wake.remote.impl.RemoteSenderStage close
> WARNING: Executor dropped 0 tasks.
> Mar 21, 2017 2:07:04 PM org.apache.reef.runtime.common.driver.DriverRuntimeStopHandler
onNext
> INFO: Driver shutdown complete
> {code}
> Instead of
> {code:none|title=normal driver.stderr}
> Mar 21, 2017 3:18:46 PM org.apache.reef.runtime.common.driver.DriverStatusManager onComplete
> INFO: Clean shutdown of the Driver.
> Mar 21, 2017 3:18:48 PM org.apache.reef.runtime.common.driver.DriverRuntimeStopHandler
onNext
> INFO: Driver shutdown complete
> {code}
> *P.S.* Unfortunately, I am unable to reproduce the error with more granular logging level.
If someone can do that, please post your Driver's log here



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message