flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-7213) Introduce state management by OperatorID in TaskManager
Date Mon, 24 Jul 2017 12:26:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16098285#comment-16098285 ]

ASF GitHub Bot commented on FLINK-7213:
---------------------------------------

Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4353#discussion_r129020863
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/OperatorSubtaskState.java ---
    @@ -18,20 +18,40 @@
     
     package org.apache.flink.runtime.checkpoint;
     
    +import org.apache.flink.annotation.VisibleForTesting;
     import org.apache.flink.runtime.state.CompositeStateHandle;
     import org.apache.flink.runtime.state.KeyedStateHandle;
     import org.apache.flink.runtime.state.OperatorStateHandle;
     import org.apache.flink.runtime.state.SharedStateRegistry;
     import org.apache.flink.runtime.state.StateObject;
     import org.apache.flink.runtime.state.StateUtil;
     import org.apache.flink.runtime.state.StreamStateHandle;
    +import org.apache.flink.util.Preconditions;
    +
     import org.slf4j.Logger;
     import org.slf4j.LoggerFactory;
     
    -import java.util.Arrays;
    +import javax.annotation.Nonnull;
    +import javax.annotation.Nullable;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.List;
     
     /**
    - * Container for the state of one parallel subtask of an operator. This is part of the {@link OperatorState}.
    + * This class encapsulates the state for one parallel instance of an operator. The complete state of a (logical)
    + * operator (e.g. a flatmap operator) consists of the union of all {@link OperatorSubtaskState}s from all
    + * parallel tasks that physically execute parallelized, physical instances of the operator.
    + * <p>The full state of the logical operator is represented by {@link OperatorState} which consists of
    + * {@link OperatorSubtaskState}s.
    + * <p>Typically, we expect all collections in this class to be of size 0 or 1, because there up to one state handle
    + * produced per state type (e.g. managed-keyed, raw-operator, ...). In particular, this holds when taking a snapshot.
    + * The purpose of having the state handles in collections is that this class is also reused in restoring state.
    + * Under normal circumstances, the expected size of each collection is still 0 or 1, except for scale-down. In
    --- End diff --
    
    How come we don't need this in the current master, where this class is also used for restoring state?


> Introduce state management by OperatorID in TaskManager
> -------------------------------------------------------
>
>                 Key: FLINK-7213
>                 URL: https://issues.apache.org/jira/browse/FLINK-7213
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.4.0
>            Reporter: Stefan Richter
>            Assignee: Stefan Richter
>
> FLINK-5892 introduced the job manager / checkpoint coordinator part of managing state on the operator level instead of the task level by introducing explicit operator_id -> state mappings. However, this explicit mapping was not introduced on the task manager side, so the explicit mapping is still converted into a mapping that suits the implicit operator chain order.
> We should also introduce explicit operator ids to state management on the task manager.
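
To illustrate the distinction the issue description draws, here is a minimal sketch contrasting an explicit operator_id -> state mapping with a positional list that follows the operator chain order. It is not Flink code: the class OperatorStateMappingSketch, the methods toChainOrder and lookup, and the use of plain strings for operator ids and state are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Flink.
public class OperatorStateMappingSketch {

    // Implicit scheme: state is a positional list, so entry i is assumed to
    // belong to the i-th operator of the chain. An operator without state
    // still occupies a slot (here: null) to keep the positions aligned.
    static List<String> toChainOrder(Map<String, String> stateByOperatorId,
                                     List<String> chainOrder) {
        List<String> positional = new ArrayList<>();
        for (String operatorId : chainOrder) {
            positional.add(stateByOperatorId.get(operatorId));
        }
        return positional;
    }

    // Explicit scheme: state is looked up directly by operator id and does
    // not depend on where the operator sits in the chain.
    static String lookup(Map<String, String> stateByOperatorId, String operatorId) {
        return stateByOperatorId.get(operatorId);
    }

    public static void main(String[] args) {
        Map<String, String> stateByOperatorId = new HashMap<>();
        stateByOperatorId.put("source", "source-state");
        stateByOperatorId.put("sink", "sink-state");

        // Assumed chain of three operators; "map" has no state.
        List<String> chainOrder = Arrays.asList("source", "map", "sink");

        System.out.println(toChainOrder(stateByOperatorId, chainOrder));
        System.out.println(lookup(stateByOperatorId, "sink"));
    }
}
```

In the positional form, the consumer has to know the chain order to interpret the list, whereas the keyed lookup stays valid if the chain is reordered. That is the property the issue appears to be after on the task manager side.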



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
