flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1492) Exceptions on shutdown concerning BLOB store cleanup
Date Tue, 10 Feb 2015 10:56:34 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314010#comment-14314010 ]

ASF GitHub Bot commented on FLINK-1492:
---------------------------------------

Github user uce commented on a diff in the pull request:

    https://github.com/apache/flink/pull/376#discussion_r24403795
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/blob/BlobServer.java ---
    @@ -196,23 +199,28 @@ public void run() {
     	 */
     	@Override
     	public void shutdown() throws IOException {
    -
    -		this.shutdownRequested = true;
    -		try {
    -			this.serverSocket.close();
    -		} catch (IOException ioe) {
    +		if (shutdownRequested.compareAndSet(false, true)) {
    +			try {
    +				this.serverSocket.close();
    +			}
    +			catch (IOException ioe) {
     				LOG.debug("Error while closing the server socket.", ioe);
    -		}
    -		try {
    -			join();
    -		} catch (InterruptedException ie) {
    -			LOG.debug("Error while waiting for this thread to die.", ie);
    -		}
    +			}
    +			try {
    +				join();
    --- End diff --
    
    This is carried over from the old code, and I am not sure it really needs to stay.
    It does ensure that the BlobServer thread has actually finished once the shutdown
    method has run: BlobServer is a Thread, and because join() is called from outside
    the run() method, the caller waits for the BlobServer thread to finish.
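
The shutdown pattern discussed in the diff above can be sketched in isolation. This is a minimal illustration, not the actual BlobServer code: the class name `SketchServer`, the `cleanupRuns` counter, and the loopback socket setup are assumptions made up for the example. It shows a `compareAndSet` guard that makes `shutdown()` safe to call more than once, closing the server socket to unblock `accept()`, and a `join()` issued from the calling thread so that the caller waits for the server thread to end.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a server thread; not the actual BlobServer code.
class SketchServer extends Thread {

    private final ServerSocket serverSocket;
    private final AtomicBoolean shutdownRequested = new AtomicBoolean(false);
    // Counts how often the guarded shutdown branch ran (illustration only).
    private final AtomicInteger cleanupRuns = new AtomicInteger();

    SketchServer() {
        try {
            // Port 0 lets the OS pick a free port on the loopback interface.
            this.serverSocket = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void run() {
        while (!shutdownRequested.get()) {
            try (Socket ignored = serverSocket.accept()) {
                // A real server would handle the connection here.
            } catch (IOException e) {
                // accept() fails once the socket is closed; the loop
                // condition then sees the flag and the thread ends.
            }
        }
    }

    public void shutdown() {
        // Only the first caller wins the compareAndSet; later calls skip the body.
        if (shutdownRequested.compareAndSet(false, true)) {
            cleanupRuns.incrementAndGet();
            try {
                serverSocket.close(); // unblocks accept() in run()
            } catch (IOException e) {
                // Nothing useful to do in this sketch.
            }
            try {
                // Called from outside run(), so this waits for the server thread.
                join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    int cleanupRuns() {
        return cleanupRuns.get();
    }
}
```

In this sketch, calling `shutdown()` twice runs the guarded branch only once, and the server thread is no longer alive when the first call has finished. Note that `join()` would block forever if `shutdown()` were invoked from the server thread itself, so the sketch assumes it is always called from another thread.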


> Exceptions on shutdown concerning BLOB store cleanup
> ----------------------------------------------------
>
>                 Key: FLINK-1492
>                 URL: https://issues.apache.org/jira/browse/FLINK-1492
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager, TaskManager
>    Affects Versions: 0.9
>            Reporter: Stephan Ewen
>            Assignee: Ufuk Celebi
>             Fix For: 0.9
>
>
> The following stack traces do not occur every time, but they do occur frequently.
> {code}
> java.lang.IllegalArgumentException: /tmp/blobStore-7a89856a-47f9-45d6-b88b-981a3eff1982 does not exist
> 	at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637)
> 	at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
> 	at org.apache.flink.runtime.blob.BlobServer.shutdown(BlobServer.java:213)
> 	at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
> 	at org.apache.flink.runtime.jobmanager.JobManager.postStop(JobManager.scala:136)
> 	at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
> 	at org.apache.flink.runtime.jobmanager.JobManager.aroundPostStop(JobManager.scala:80)
> 	at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
> 	at akka.actor.dungeon.FaultHandling$class.handleChildTerminated(FaultHandling.scala:292)
> 	at akka.actor.ActorCell.handleChildTerminated(ActorCell.scala:369)
> 	at akka.actor.dungeon.DeathWatch$class.watchedActorTerminated(DeathWatch.scala:63)
> 	at akka.actor.ActorCell.watchedActorTerminated(ActorCell.scala:369)
> 	at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:455)
> 	at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
> 	at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
> 	at akka.dispatch.Mailbox.run(Mailbox.scala:220)
> 	at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> 	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> 	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
> 	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
> 	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> 	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 15:16:15,350 ERROR org.apache.flink.test.util.ForkableFlinkMiniCluster$$anonfun$startTaskManager$1$$anon$1 - LibraryCacheManager did not shutdown properly.
> java.io.IOException: Unable to delete file: /tmp/blobStore-e2619536-fb7c-452a-8639-487a074d1582/cache/blob_ff74895f7bdeeaa3bd70b6932beed143048bb4c7
> 	at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
> 	at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
> 	at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
> 	at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
> 	at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
> 	at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
> 	at org.apache.flink.runtime.blob.BlobCache.shutdown(BlobCache.java:159)
> 	at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.shutdown(BlobLibraryCacheManager.java:171)
> 	at org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:173)
> 	at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
> 	at org.apache.flink.runtime.taskmanager.TaskManager.aroundPostStop(TaskManager.scala:86)
> 	at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
> 	at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
> 	at akka.actor.ActorCell.terminate(ActorCell.scala:369)
> 	at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
> 	at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
> 	at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
> 	at akka.dispatch.Mailbox.run(Mailbox.scala:220)
> 	at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> 	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> 	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
> 	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
> 	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> 	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 15:16:15,345 ERROR org.apache.flink.runtime.blob.BlobCache                       - Error deleting directory /tmp/blobStore-4313349e-8a58-4683-9fd0-3d2c52be1864 during JVM shutdown: /tmp/blobStore-4313349e-8a58-4683-9fd0-3d2c52be1864 does not exist
> java.lang.IllegalArgumentException: /tmp/blobStore-4313349e-8a58-4683-9fd0-3d2c52be1864 does not exist
> 	at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637)
> 	at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
> 	at org.apache.flink.runtime.blob.BlobUtils$1.run(BlobUtils.java:210)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
