flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Shannon Carey <sca...@expedia.com>
Subject blob store defaults to /tmp and files get deleted
Date Fri, 17 Feb 2017 18:24:28 GMT
A few of my jobs recently failed and showed this exception:

org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot load user class: org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09
ClassLoader info: URL ClassLoader:
    file: '/tmp/blobStore-5f023409-6af5-4de6-8ed0-e80a2eb9633e/cache/blob_d9a9fb884f3b436030afcf7b8e1bce678acceaf2'
(invalid JAR: zip file is empty)
Class not resolvable through given classloader.
        at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:208)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:224)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:642)
        at java.lang.Thread.run(Thread.java:745)

As you can see, Flink is storing things underneath /tmp, which is the (undocumented) default
for the blob store. As you may know, on Linux, there's typically a program such as tmpwatch
which is run periodically to clear out data from /tmp.

Flink also uses /tmp as the default for jobmanager.web.tmpdir (and jobmanager.web.upload.dir
in 1.2).

Therefore, assuming that this is indeed the cause of the job failure/the exception, it seems
highly advisable that when you run a Flink cluster you configure blob.storage.directory and
jobmanager.web.tmpdir to a specific folder that is not beneath /tmp. I don't know if there
is any material about setting up a production cluster, but this would definitely seem to be
a necessary configuration to specify if you want to avoid problems. Enabling High Availability
mode should also be on that list, I think.

View raw message