flink-user mailing list archives

From Robert Metzger <rmetz...@apache.org>
Subject Re: Restart Flink in Yarn
Date Thu, 05 May 2016 11:49:08 GMT
Hi Dominique,
I'm sorry that you ran into this issue.
What do you mean by "flink streaming routes"?

Regarding the second question: "Now I want to restart these routes to
continue their work from the last checkpoint. What can I do?"
I think the feature you are looking for is savepoints.
However, savepoints were added to Flink in 1.0, so they are not available in
your 0.10 release.

I have to admit that I haven't seen the "Cannot find required BLOB at ..."
exception before. Is there any chance that the files have been deleted
from the /tmp directory by an external service (such as a periodic cleanup
script), or that the /tmp directory has been mounted to another disk in the
meantime?
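
One way to check this on the JobManager host is sketched below. It is only a
minimal, standalone sketch, not part of Flink: the class name BlobStoreCheck
and its two helper methods are made up for illustration, and the directory
layout (/tmp/blobStore-<id>/cache/blob_<hash>) is assumed purely from the
paths in your stack trace. It lists the blobStore-* directories under a root
directory and reports whether the blob with the given hash is still there.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical diagnostic helper; not a Flink class.
class BlobStoreCheck {

    // Returns all directories named blobStore-* directly under tmpRoot.
    // The naming is assumed from the paths shown in the stack trace.
    static List<Path> findBlobStoreDirs(Path tmpRoot) throws IOException {
        if (!Files.isDirectory(tmpRoot)) {
            return Collections.emptyList();
        }
        try (Stream<Path> entries = Files.list(tmpRoot)) {
            return entries
                    .filter(Files::isDirectory)
                    .filter(p -> p.getFileName().toString().startsWith("blobStore-"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    // Checks for <blobStoreDir>/cache/blob_<hash>, the layout seen in the trace.
    static boolean blobExists(Path blobStoreDir, String hash) {
        return Files.isRegularFile(blobStoreDir.resolve("cache").resolve("blob_" + hash));
    }

    public static void main(String[] args) throws IOException {
        // Defaults are the values from the stack trace; override via arguments.
        Path tmpRoot = Paths.get(args.length > 0 ? args[0] : "/tmp");
        String hash = args.length > 1 ? args[1] : "8f15fe4a8137ca2f9fb348ec634f3703f4fd7317";

        List<Path> dirs = findBlobStoreDirs(tmpRoot);
        if (dirs.isEmpty()) {
            System.out.println("No blobStore-* directory found under " + tmpRoot);
        }
        for (Path dir : dirs) {
            String state = blobExists(dir, hash) ? "blob present" : "blob MISSING";
            System.out.println(dir + " -> " + state);
        }
    }
}
```

If the directory named in the "Cannot find required BLOB at ..." message is
missing entirely, or exists without the blob file, that would point to
something outside Flink having cleaned up /tmp.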

On Wed, May 4, 2016 at 6:27 PM, Dominique Rondé <dominique.ronde@allsecur.de> wrote:

> Hi @all,
> I have a Yarn cluster with 5 nodes and a running Flink (0.10.2) instance.
> Today we shut down one of the Yarn hosts for maintenance reasons. After
> the restart we have some flink streaming routes in a restarting status (see
> stacktrace below). Now I want to restart these routes to continue their
> work from the last checkpoint. What can I do?
> Greets
> Dominique
> Stacktrace
> ===================================================================================
> java.io.IOException: Cannot get library with hash 8f15fe4a8137ca2f9fb348ec634f3703f4fd7317
> 	at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerReferenceToBlobKeyAndGetURL(BlobLibraryCacheManager.java:254)
> 	at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerTask(BlobLibraryCacheManager.java:114)
> 	at org.apache.flink.runtime.taskmanager.Task.createUserCodeClassloader(Task.java:710)
> 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:471)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Failed to fetch BLOB 8f15fe4a8137ca2f9fb348ec634f3703f4fd7317 from / and store it under /tmp/blobStore-efdeddf9-d096-440f-a4cb-9c79334ff92c/cache/blob_8f15fe4a8137ca2f9fb348ec634f3703f4fd7317
> 	at org.apache.flink.runtime.blob.BlobCache.getURL(BlobCache.java:177)
> 	at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerReferenceToBlobKeyAndGetURL(BlobLibraryCacheManager.java:245)
> 	... 4 more
> Caused by: java.io.IOException: GET operation failed: Server side error: Cannot find required BLOB at /tmp/blobStore-0f9a63e3-5700-4d47-aea7-310506c1496c/cache/blob_8f15fe4a8137ca2f9fb348ec634f3703f4fd7317
> 	at org.apache.flink.runtime.blob.BlobClient.get(BlobClient.java:165)
> 	at org.apache.flink.runtime.blob.BlobCache.getURL(BlobCache.java:125)
> 	... 5 more
> Caused by: java.io.IOException: Server side error: Cannot find required BLOB at /tmp/blobStore-0f9a63e3-5700-4d47-aea7-310506c1496c/cache/blob_8f15fe4a8137ca2f9fb348ec634f3703f4fd7317
> 	at org.apache.flink.runtime.blob.BlobClient.receiveAndCheckResponse(BlobClient.java:213)
> 	at org.apache.flink.runtime.blob.BlobClient.get(BlobClient.java:159)
> 	... 6 more
> Caused by: java.io.IOException: Cannot find required BLOB at /tmp/blobStore-0f9a63e3-5700-4d47-aea7-310506c1496c/cache/blob_8f15fe4a8137ca2f9fb348ec634f3703f4fd7317
> 	at org.apache.flink.runtime.blob.BlobServerConnection.get(BlobServerConnection.java:202)
> 	at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:112)
