flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5892) Recover job state at the granularity of operator
Date Tue, 20 Jun 2017 07:30:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16055278#comment-16055278 ]

ASF GitHub Bot commented on FLINK-5892:
---------------------------------------

Github user pnowojski commented on the issue:

    https://github.com/apache/flink/pull/3844
  
    Are you sure that the errors in Travis are intermittent or unrelated to your change? One is
already reported here: https://issues.apache.org/jira/browse/FLINK-6843, but I'm not sure about
the second one:
    
    ```
    Tests run: 10, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 6.933 sec <<< FAILURE! - in org.apache.flink.runtime.state.OperatorStateBackendTest
    testSnapshotAsyncCancel(org.apache.flink.runtime.state.OperatorStateBackendTest)  Time elapsed: 0.061 sec  <<< ERROR!
    java.util.concurrent.ExecutionException: java.io.IOException: Stream closed.
    	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    	at java.util.concurrent.FutureTask.get(FutureTask.java:206)
    	at org.apache.flink.runtime.state.OperatorStateBackendTest.testSnapshotAsyncCancel(OperatorStateBackendTest.java:636)
    Caused by: java.io.IOException: Stream closed.
    	at org.apache.flink.runtime.util.BlockerCheckpointStreamFactory$1.write(BlockerCheckpointStreamFactory.java:95)
    	at java.io.DataOutputStream.writeInt(DataOutputStream.java:197)
    	at org.apache.flink.core.io.VersionedIOReadableWritable.write(VersionedIOReadableWritable.java:40)
    	at org.apache.flink.runtime.state.OperatorBackendSerializationProxy.write(OperatorBackendSerializationProxy.java:65)
    	at org.apache.flink.runtime.state.DefaultOperatorStateBackend$1.performOperation(DefaultOperatorStateBackend.java:255)
    	at org.apache.flink.runtime.state.DefaultOperatorStateBackend$1.performOperation(DefaultOperatorStateBackend.java:233)
    	at org.apache.flink.runtime.io.async.AbstractAsyncIOCallable.call(AbstractAsyncIOCallable.java:72)
    	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    	at java.lang.Thread.run(Thread.java:745)
    ```


> Recover job state at the granularity of operator
> ------------------------------------------------
>
>                 Key: FLINK-5892
>                 URL: https://issues.apache.org/jira/browse/FLINK-5892
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing
>            Reporter: Guowei Ma
>            Assignee: Guowei Ma
>             Fix For: 1.3.0
>
>
> `JobGraph` has no `Operator` info, so `ExecutionGraph` can only recover at the granularity
> of a task.
> As a result, an operator of the job may not recover its state from a savepoint
> even if the savepoint contains that operator's state.
>  https://docs.google.com/document/d/19suTRF0nh7pRgeMnIEIdJ2Fq-CcNVt5_hR3cAoICf7Q/edit#.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
