flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5991) Expose Broadcast Operator State through public APIs
Date Tue, 04 Apr 2017 09:17:41 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15954881#comment-15954881 ]

ASF GitHub Bot commented on FLINK-5991:

Github user StefanRRichter commented on the issue:

    Very good work, and thanks for improving the documentation. I like the update. From what
I have seen in the past, some users have misunderstood the list nature of the operator state and
simply dumped lots of small elements into the list that should not actually be the unit of
repartitioning and that sometimes even logically belonged together. I wonder whether the different
semantics of list state in operator state and in keyed state can be confusing and error-prone
for users, and what we could do about this. A method called `getListState` might be a step
in the wrong direction.
    Besides this, +1 from me.
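
To make the "unit of repartitioning" concern concrete, here is a minimal sketch in plain Java. It is not Flink code: the class `RepartitionUnitSketch`, the method `splitRoundRobin`, and the sample data are hypothetical, and the even round-robin split is an assumption standing in for whatever redistribution the runtime actually performs on rescale.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified model of list-style operator state being split on rescale; not Flink code. */
class RepartitionUnitSketch {

    /**
     * Deals the elements out round-robin to {@code parallelism} subtasks.
     * Assumption: an even round-robin split stands in for the real redistribution.
     */
    static <T> List<List<T>> splitRoundRobin(List<T> elements, int parallelism) {
        List<List<T>> subtasks = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            subtasks.add(new ArrayList<>());
        }
        for (int i = 0; i < elements.size(); i++) {
            subtasks.get(i % parallelism).add(elements.get(i));
        }
        return subtasks;
    }

    public static void main(String[] args) {
        // Too fine-grained: each partition name and its offset are separate list elements.
        List<String> scattered =
                Arrays.asList("partition-0", "offset=42", "partition-1", "offset=7");
        // One element per logical unit: a partition together with its offset.
        List<String> grouped = Arrays.asList("partition-0@42", "partition-1@7");

        System.out.println(splitRoundRobin(scattered, 2));
        System.out.println(splitRoundRobin(grouped, 2));
    }
}
```

With the fine-grained list, the two subtasks end up with `[partition-0, partition-1]` and `[offset=42, offset=7]`, so names and offsets are separated. With one element per logical unit, each subtask receives a complete `partition@offset` entry.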

> Expose Broadcast Operator State through public APIs
> ---------------------------------------------------
>                 Key: FLINK-5991
>                 URL: https://issues.apache.org/jira/browse/FLINK-5991
>             Project: Flink
>          Issue Type: New Feature
>          Components: DataStream API, State Backends, Checkpointing
>            Reporter: Tzu-Li (Gordon) Tai
>            Assignee: Tzu-Li (Gordon) Tai
>             Fix For: 1.3.0
> The broadcast operator state functionality was added in FLINK-5265, but it hasn't been
exposed through any public APIs yet.
> Currently, we have 2 streaming connector features for 1.3 that are waiting on broadcast
state: rescalable Kinesis / Kafka consumers with shard / partition discovery (FLINK-4821 &
FLINK-4022). We should consider exposing broadcast state for the 1.3 release as well.
> This JIRA also serves as the place to discuss how we want to expose it.
> To initiate the discussion, I propose:
> 1. For the more powerful {{CheckpointedFunction}}, add the following to the {{OperatorStateStore}}:
> {code}
> <S> ListState<S> getBroadcastOperatorState(ListStateDescriptor<S> stateDescriptor);
> <T extends Serializable> ListState<T> getBroadcastSerializableListState(String
> {code}
> 2. For a simpler {{ListCheckpointed}} variant, we probably should have a separate {{BroadcastListCheckpointed}}.
> Extending {{ListCheckpointed}} to let the user define the list state type as either
{{PARTITIONABLE}} or {{BROADCAST}} might also be possible, if we can rely on a contract that
the value doesn't change. Alternatively, we could expose a defining method (e.g. {{getListStateType()}})
that is called only once in the operator. This would break user code, but it can be considered
because the interface is marked as {{PublicEvolving}}.
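
As an illustration of what the two list state types in the proposal would mean on restore, here is a toy model in plain Java. It is a sketch under stated assumptions, not Flink's implementation: the class `ListStateModesSketch`, the enum `ListStateType`, and the method `restore` are hypothetical (only the constant names {{PARTITIONABLE}} and {{BROADCAST}} come from the issue text). It also assumes that {{PARTITIONABLE}} splits the checkpointed elements across the new subtasks, and that {{BROADCAST}} hands every subtask the full union of all elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the two redistribution modes named in the issue; not Flink's implementation. */
class ListStateModesSketch {

    /** Constant names are taken from the issue text; the enum itself is hypothetical. */
    enum ListStateType { PARTITIONABLE, BROADCAST }

    /**
     * Redistributes the per-subtask lists of a checkpoint to {@code newParallelism} subtasks.
     * Assumption: PARTITIONABLE splits round-robin, BROADCAST gives each subtask everything.
     */
    static <T> List<List<T>> restore(
            List<List<T>> checkpointed, int newParallelism, ListStateType type) {
        // Gather the elements snapshotted by all old subtasks, in order.
        List<T> all = new ArrayList<>();
        for (List<T> perSubtask : checkpointed) {
            all.addAll(perSubtask);
        }
        List<List<T>> restored = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            restored.add(new ArrayList<>());
        }
        for (int i = 0; i < all.size(); i++) {
            if (type == ListStateType.BROADCAST) {
                // Every new subtask sees every element.
                for (List<T> subtask : restored) {
                    subtask.add(all.get(i));
                }
            } else {
                // Each element goes to exactly one new subtask.
                restored.get(i % newParallelism).add(all.get(i));
            }
        }
        return restored;
    }

    public static void main(String[] args) {
        // State checkpointed with parallelism 2, restored with parallelism 3.
        List<List<String>> checkpointed =
                Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c"));
        System.out.println(restore(checkpointed, 3, ListStateType.PARTITIONABLE));
        System.out.println(restore(checkpointed, 3, ListStateType.BROADCAST));
    }
}
```

In this model, the partitionable restore yields `[[a], [b], [c]]`, while the broadcast restore yields `[[a, b, c], [a, b, c], [a, b, c]]`. That difference is why the type would need to stay fixed for the lifetime of an operator.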

This message was sent by Atlassian JIRA
