flink-dev mailing list archives

From "Sihua Zhou (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-7873) Introduce HybridStreamStateHandle for quick recovery from checkpoint.
Date Thu, 19 Oct 2017 11:44:00 GMT
Sihua Zhou created FLINK-7873:

             Summary: Introduce HybridStreamStateHandle for quick recovery from checkpoint.
                 Key: FLINK-7873
                 URL: https://issues.apache.org/jira/browse/FLINK-7873
             Project: Flink
          Issue Type: New Feature
          Components: State Backends, Checkpointing
    Affects Versions: 1.3.2
            Reporter: Sihua Zhou
            Assignee: Sihua Zhou

The current recovery strategy always reads checkpoint data from the remote file system (HDFS).
This consumes a lot of network bandwidth when the state is large (e.g. 1 TB); that cost can be
avoided by reading the checkpoint data from local disk. So I propose a HybridStreamStateHandle
that tries to create a local input stream first and, if that fails, creates a remote input stream
instead. Its prototype looks like this:
class HybridStreamHandle {
   private FileStateHandle localHandle;
   private FileStateHandle remoteHandle;
   public FSDataInputStream openInputStream() throws IOException {
        try {
            // Prefer the local copy; it avoids reading over the network.
            return localHandle.openInputStream();
        } catch (IOException e) {
            return remoteHandle.openInputStream();  // fall back to the remote copy
        }
   }
}
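
The local-first fallback described above can be tried out without a Flink cluster. The following is a minimal, self-contained sketch using only the JDK; StreamSource, LocalFirstSource and LocalFirstDemo are hypothetical names invented for illustration and are not part of Flink or of the proposed patch. It treats both a null stream and an IOException from the local source as "local copy unavailable", which is an assumption about how failure would be signalled.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical stand-in for a state handle: anything that can open a stream. */
interface StreamSource {
    InputStream open() throws IOException;
}

/** Tries the local source first and falls back to the remote source on failure. */
final class LocalFirstSource implements StreamSource {
    private final StreamSource local;
    private final StreamSource remote;

    LocalFirstSource(StreamSource local, StreamSource remote) {
        this.local = local;
        this.remote = remote;
    }

    @Override
    public InputStream open() throws IOException {
        try {
            InputStream in = local.open();
            if (in != null) {
                return in;
            }
        } catch (IOException e) {
            // Local copy is missing or unreadable; fall through to the remote copy.
        }
        return remote.open();
    }
}

class LocalFirstDemo {
    /** Reads the whole stream into a UTF-8 string and closes it. */
    static String readAll(StreamSource source) throws IOException {
        try (InputStream in = source.open()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed file name; Files.newInputStream throws an IOException if it is absent.
        StreamSource local = () -> Files.newInputStream(Paths.get("does-not-exist.chk"));
        // An in-memory stream plays the role of the remote (e.g. HDFS) copy.
        StreamSource remote =
            () -> new ByteArrayInputStream("remote copy".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(new LocalFirstSource(local, remote)));
    }
}
```

One design point the sketch makes visible: a failure of the local source is swallowed silently, so a real implementation would probably want to log it, otherwise a broken local disk would go unnoticed while every recovery quietly falls back to the network.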

