Hi Darshan,

This is a known problem with Flink: the client-side stack trace gives no specific exception information, which makes diagnosis more difficult.
My guess is that the savepoint path points at a local file system, which may be the cause of the problem.
Can you specify an HDFS path that you have permission to access for the savepoint?
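
As a rough sketch of what I mean, here is your original command with only the -s argument changed. The HDFS path below is a placeholder I made up (I do not know your cluster layout), so replace it with a savepoint location that your Flink cluster can actually read:

```shell
# Hypothetical savepoint location on HDFS (an assumption, not taken from your setup).
SAVEPOINT_PATH="hdfs:///flink/savepoints/my-savepoint"

# Same submission as in the original command; only the -s argument differs.
FLINK_CMD="bin/flink run -d -c st -s ${SAVEPOINT_PATH} ./target/poc-1.0-SNAPSHOT-jar-with-dependencies.jar"

# Print the command for review before running it from the Flink home directory.
echo "${FLINK_CMD}"
```

If the job still fails with the same generic error, the path itself is probably not the only problem.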

Thanks, vino.

2018-07-30 23:23 GMT+08:00 Darshan Singh <darshan.meel@gmail.com>:
I am trying to submit a job from a savepoint/checkpoint, and it is failing with the error below. Without the -s flag it works fine. Am I missing something here?


Thanks

>bin/flink run -d  -c st -s file:///tmp/db/checkpoint/ ./target/poc-1.0-SNAPSHOT-jar-with-dependencies.jar

Starting execution of program

------------------------------------------------------------
 The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: Could not submit job c70ab528e98178bb7b9d8c622511e9f5.
    at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:247)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:464)
    at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:77)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:410)
    at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:785)
    at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:279)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:214)
    at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1025)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1101)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1101)
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
    at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:370)
    at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
    at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
    at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:214)
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
    at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929)
    at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Exception is not retryable.
    at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
    at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
    at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
    at java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:899)
    ... 12 more
Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Exception is not retryable.
    ... 10 more
Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.rest.util.RestClientException: [Job submission failed.]
    at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
    at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
    at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
    at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:953)
    at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
    ... 4 more
Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Job submission failed.]
    at org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:309)
    at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:293)
    at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
    ... 5 more