flink-issues mailing list archives

From "Ufuk Celebi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-4425) "Out Of Memory" during savepoint deserialization
Date Thu, 18 Aug 2016 16:01:21 GMT

    [ https://issues.apache.org/jira/browse/FLINK-4425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426698#comment-15426698 ]

Ufuk Celebi commented on FLINK-4425:
------------------------------------

Thanks for reporting this. 

(1) Is it possible to share your user program with some data?

If that is not possible, could you (2) trigger the savepoint while the job is using a
MemoryStateBackend and share the savepoint file? The savepoint will then be self-contained,
so you can attach it here.
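
For reference, a minimal sketch of what (2) could look like, assuming the Flink 1.1
configuration keys and command-line client. <jobID>, <savepointPath> and <your-job.jar>
are placeholders, not values from this issue:

{code:title=flink-conf.yaml (sketch)|borderStyle=solid}
# Assumption: in Flink 1.1, "jobmanager" selects the in-memory state backend
# (MemoryStateBackend), so the state is stored inside the savepoint itself.
state.backend: jobmanager

# After restarting the job with this setting, trigger a savepoint and
# then try to resume from it:
#   bin/flink savepoint <jobID>
#   bin/flink run -s <savepointPath> <your-job.jar>
{code}

The same backend can also be set per job in the program with
env.setStateBackend(new MemoryStateBackend()).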

I can then try to reproduce it.

> "Out Of Memory" during savepoint deserialization
> ------------------------------------------------
>
>                 Key: FLINK-4425
>                 URL: https://issues.apache.org/jira/browse/FLINK-4425
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 1.1.1
>            Reporter: Sergii Koshel
>
> I've created a savepoint and am trying to start the job from it (via the -s param), but I get
> the exception below:
> {code:title=Exception|borderStyle=solid}
> java.lang.OutOfMemoryError: Java heap space
>         at org.apache.flink.runtime.checkpoint.savepoint.SavepointV1Serializer.deserialize(SavepointV1Serializer.java:167)
>         at org.apache.flink.runtime.checkpoint.savepoint.SavepointV1Serializer.deserialize(SavepointV1Serializer.java:42)
>         at org.apache.flink.runtime.checkpoint.savepoint.FsSavepointStore.loadSavepoint(FsSavepointStore.java:133)
>         at org.apache.flink.runtime.checkpoint.savepoint.SavepointCoordinator.restoreSavepoint(SavepointCoordinator.java:201)
>         at org.apache.flink.runtime.executiongraph.ExecutionGraph.restoreSavepoint(ExecutionGraph.java:983)
>         at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1302)
>         at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1291)
>         at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1291)
>         at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>         at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>         at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
>         at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
>         at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>         at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>         at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>         at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}
> jobmanager.heap.mb: 1280
> taskmanager.heap.mb: 1024
> java 1.8
> savepoint + checkpoint size < 1 MB in total



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
