flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-8360) Implement task-local state recovery
Date Wed, 14 Feb 2018 13:42:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364041#comment-16364041 ]

ASF GitHub Bot commented on FLINK-8360:
---------------------------------------

Github user StefanRRichter commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5239#discussion_r168176031
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/state/TaskStateManager.java ---
    @@ -60,4 +62,9 @@ void reportTaskStateSnapshots(
     	 * @return previous state for the operator. Null if no previous state exists.
     	 */
     	OperatorSubtaskState operatorStates(OperatorID operatorID);
    +
    +	/**
    +	 * Returns the base directory for all file-based local state of the owning subtask.
    +	 */
    +	File getSubtaskLocalStateBaseDirectory();
    --- End diff --
    
    This is the manager, not the state objects. So for local recovery that is not based on
local files, the backend will just not care about directories and will not invoke this method.
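
The point of the comment can be illustrated with a minimal Java sketch. Only the method name `getSubtaskLocalStateBaseDirectory()` comes from the diff; `ManagerSketch`, `BackendSketch`, `FileBasedBackend`, `HeapBasedBackend`, `CountingManager`, and the `local-state` path are hypothetical names made up for illustration and are not Flink's actual API. The sketch assumes that a backend decides for itself whether it needs a directory.

```java
import java.io.File;

public class LocalStateDirectorySketch {

    /** Hypothetical, reduced stand-in for the manager interface touched by the diff. */
    interface ManagerSketch {
        File getSubtaskLocalStateBaseDirectory();
    }

    /** Hypothetical backend abstraction, not Flink's actual state backend API. */
    interface BackendSketch {
        String prepareLocalCopy(ManagerSketch manager, long checkpointId);
    }

    /** File-based local recovery: asks the manager where local files should live. */
    static final class FileBasedBackend implements BackendSketch {
        @Override
        public String prepareLocalCopy(ManagerSketch manager, long checkpointId) {
            File base = manager.getSubtaskLocalStateBaseDirectory();
            return "file:" + new File(base, "chk-" + checkpointId).getPath();
        }
    }

    /** Non-file local recovery (for example a heap copy): never asks for the directory. */
    static final class HeapBasedBackend implements BackendSketch {
        @Override
        public String prepareLocalCopy(ManagerSketch manager, long checkpointId) {
            return "heap:chk-" + checkpointId;
        }
    }

    /** Manager that counts how often the directory is requested. */
    static final class CountingManager implements ManagerSketch {
        int directoryRequests = 0;

        @Override
        public File getSubtaskLocalStateBaseDirectory() {
            directoryRequests++;
            return new File("local-state"); // placeholder path, assumed for illustration
        }
    }

    public static void main(String[] args) {
        CountingManager manager = new CountingManager();
        System.out.println(new HeapBasedBackend().prepareLocalCopy(manager, 1L));
        System.out.println("directory requests: " + manager.directoryRequests);
        System.out.println(new FileBasedBackend().prepareLocalCopy(manager, 1L));
        System.out.println("directory requests: " + manager.directoryRequests);
    }
}
```

In this sketch the manager offers the directory unconditionally, and only the file-based backend ever asks for it.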


> Implement task-local state recovery
> -----------------------------------
>
>                 Key: FLINK-8360
>                 URL: https://issues.apache.org/jira/browse/FLINK-8360
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing
>            Reporter: Stefan Richter
>            Assignee: Stefan Richter
>            Priority: Major
>             Fix For: 1.5.0
>
>
> This issue tracks the development of recovery from task-local state. The main idea is
to have a secondary, local copy of the checkpointed state, while there is still a primary
copy in DFS that we report to the checkpoint coordinator.
> Recovery can attempt to restore from the secondary local copy, if available, to save
network bandwidth. This requires that the assignment of tasks to slots is as sticky as possible.
> For starters, we will implement this feature for all managed keyed states and can easily
enhance it to all other state types (e.g. operator state) later.
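
The local-first idea in the issue description could be sketched roughly as follows. This is an illustrative assumption only: `LocalFirstRestoreSketch`, `restore`, and the supplier parameters are hypothetical names, and the issue does not say how the actual implementation is structured.

```java
import java.util.Optional;
import java.util.function.Supplier;

public class LocalFirstRestoreSketch {

    /**
     * Returns the secondary local copy when the lookup finds one;
     * otherwise loads the primary copy (the one kept in DFS).
     */
    static <T> T restore(Supplier<Optional<T>> localLookup, Supplier<T> primaryLoader) {
        // The primary loader only runs when no local copy is available.
        return localLookup.get().orElseGet(primaryLoader);
    }

    public static void main(String[] args) {
        Supplier<Optional<String>> localHit = () -> Optional.of("state from local copy");
        Supplier<Optional<String>> localMiss = Optional::empty;
        Supplier<String> dfs = () -> "state from DFS";

        System.out.println(restore(localHit, dfs));
        System.out.println(restore(localMiss, dfs));
    }
}
```

A local hit only helps if the task is rescheduled to the slot that holds the copy, which is why the description asks for sticky assignment.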



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
