flink-issues mailing list archives

From "Kostas Kloudas (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-7771) Make the operator state queryable
Date Fri, 06 Oct 2017 08:31:03 GMT
Kostas Kloudas created FLINK-7771:

             Summary: Make the operator state queryable
                 Key: FLINK-7771
                 URL: https://issues.apache.org/jira/browse/FLINK-7771
             Project: Flink
          Issue Type: Improvement
          Components: Queryable State
    Affects Versions: 1.4.0
            Reporter: Kostas Kloudas
            Assignee: Kostas Kloudas
             Fix For: 1.4.0

There seem to be some requests for making the operator (non-keyed) state queryable. This means
that the user will specify the *uuid* of the operator and the *taskId*, and will then be able
to access the state that corresponds to that operator for that specific task.

This issue will serve to document the discussion on the topic, so that everybody can participate.

Personally, I think that such a feature should wait until some aspects of state handling are
stabilized (_e.g._ replication and checkpoint management). My main concerns have to do with
the semantics and guarantees that such a feature could offer *for now*.

First, operator state is essentially a list state that can be reshuffled arbitrarily upon
restoring or rescaling. This means that task1 may hold elements _A,B,C_ at a given execution
attempt, while after restoring (even without rescaling) it may hold _D,B,E_, without this implying
that something happened to states _A_ and _C_. They were simply assigned to another task.
This makes it hard to reason about the results that you get at any point in time, as it provides
*no locality/consistency guarantees between executions*.
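
To make the reshuffling concrete, here is a minimal sketch in plain Python. It is a toy model, not Flink code: the function `redistribute` and its `offset` parameter are hypothetical and only stand in for whatever assignment the runtime happens to choose on restore.

```python
def redistribute(elements, parallelism, offset=0):
    """Assign list-state elements to tasks.

    `offset` is a hypothetical knob that stands in for the arbitrary
    reshuffle that may happen on restore; it is not a Flink parameter.
    """
    tasks = [[] for _ in range(parallelism)]
    for i, element in enumerate(elements):
        tasks[(i + offset) % parallelism].append(element)
    return tasks


elements = ["A", "B", "C", "D", "E", "F"]

# Two execution attempts with the same parallelism (no rescaling).
attempt_1 = redistribute(elements, parallelism=2, offset=0)
attempt_2 = redistribute(elements, parallelism=2, offset=1)

# No element is lost across attempts; only the task that holds it changes.
assert sorted(sum(attempt_1, [])) == sorted(sum(attempt_2, []))
```

In this model the first task holds different elements in the two attempts although nothing was added or removed, which is why a query for a fixed (operator, task) pair cannot be compared across executions.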

The above, in combination with the fact that (for now) it is not possible to query the state
at a specific point in time (_e.g._ the last checkpointed state), means that there is no easy
way to get a consistent view of the state of an operator. So in the example above, when querying
_(operatorA, task1)_ and _(operatorA, task2)_, the user can get states belonging to different
"points in time", which can result in duplicates, lost values and all the problems encountered
in distributed systems when there are no consistency guarantees.
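
The duplicate/lost-value problem can be sketched with the same toy model. Everything here is assumed for illustration: the `ATTEMPTS` table, `query`, `combined_view` and `diagnose` are hypothetical names, and the per-task contents simply reuse the _A,B,C_ / _D,B,E_ example from above.

```python
from collections import Counter

# Hypothetical contents of one operator's list state, per task, at two
# execution attempts (before and after a restore without rescaling).
ATTEMPTS = {
    0: {"task1": ["A", "B", "C"], "task2": ["D", "E", "F"]},
    1: {"task1": ["D", "B", "E"], "task2": ["A", "C", "F"]},
}


def query(task, attempt):
    """Return the state a client would see for `task` at `attempt`."""
    return list(ATTEMPTS[attempt][task])


def combined_view(reads):
    """Merge the results of several (task, attempt) queries."""
    merged = []
    for task, attempt in reads:
        merged.extend(query(task, attempt))
    return merged


def diagnose(view, attempt=0):
    """Compare a merged view against the full state of one attempt."""
    expected = Counter(e for elems in ATTEMPTS[attempt].values() for e in elems)
    seen = Counter(view)
    duplicates = sorted(e for e, n in seen.items() if n > 1)
    lost = sorted(e for e in expected if e not in seen)
    return duplicates, lost


# task1 is read before the restore, task2 after it.
mixed = combined_view([("task1", 0), ("task2", 1)])
duplicates, lost = diagnose(mixed)
```

With these assumed contents, the mixed read sees _A_ and _C_ twice and never sees _D_ and _E_, whereas reading both tasks from the same attempt yields neither duplicates nor losses.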

The above illustrates some of the consistency problems that such a feature could face now.
I also link [~till.rohrmann] and [~skonto] as he also mentioned that this feature could be

This message was sent by Atlassian JIRA
