flink-issues mailing list archives

From "Bowen Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-3089) State API Should Support Data Expiration (State TTL)
Date Fri, 12 Jan 2018 06:29:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323611#comment-16323611 ]

Bowen Li commented on FLINK-3089:

[~xfournet] Yes, supporting TTL only in processing time would be another important assumption
I'd like to make!

Implementing TTL in HeapStateBackend can be a bit tricky because of the number of timers, as
you mentioned. W.r.t. the development plan, I think we can probably add the interface and support
TTL in RocksDBStateBackend first. Then we can decide 1) whether to make it a RocksDBStateBackend-only
feature, like incremental checkpointing, and 2) if we should support TTL in HeapStateBackend, how
to implement it.

What do you think?

cc [~srichter] [~aljoscha]

> State API Should Support Data Expiration (State TTL)
> ----------------------------------------------------
>                 Key: FLINK-3089
>                 URL: https://issues.apache.org/jira/browse/FLINK-3089
>             Project: Flink
>          Issue Type: New Feature
>          Components: DataStream API, State Backends, Checkpointing
>            Reporter: Niels Basjes
>            Assignee: Bowen Li
> In some usecases (webanalytics) there is a need to have a state per visitor on a website (i.e. keyBy(sessionid) ).
> At some point the visitor simply leaves and no longer creates new events (so a special 'end of session' event will not occur).
> The only way to determine that a visitor has left is by choosing a timeout, like "After 30 minutes no events we consider the visitor 'gone'".
> Only after this (chosen) timeout has expired should we discard this state.
> In the Trigger part of Windows we can set a timer and close/discard this kind of information. But that introduces the buffering effect of the window (which in some scenarios is unwanted).
> What I would like is to be able to set a timeout on a specific state which I can update
> This makes it possible to create a map function that assigns the right value and that discards the state automatically.
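
To make the requested behaviour concrete, here is a minimal plain-Java sketch of per-key state with an inactivity timeout. It is not a Flink API and not the design proposed in this issue: the class name `TtlStateSketch`, its methods, and the injected clock are all hypothetical. It assumes processing-time semantics, restarts the timeout on every update (not on reads), and discards expired entries lazily on access instead of registering a timer per key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical sketch (not a Flink class): per-key state that expires after inactivity. */
class TtlStateSketch<K, V> {

    /** A stored value together with the time of its last update. */
    private static final class Entry<V> {
        final V value;
        final long lastUpdateMillis;

        Entry(V value, long lastUpdateMillis) {
            this.value = value;
            this.lastUpdateMillis = lastUpdateMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // assumed processing-time clock, injectable for tests

    TtlStateSketch(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Stores the value and restarts the timeout for this key. */
    void update(K key, V value) {
        entries.put(key, new Entry<>(value, clock.getAsLong()));
    }

    /** Returns the value, or null if it is absent or expired; expired entries are removed. */
    V get(K key) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() - entry.lastUpdateMillis >= ttlMillis) {
            entries.remove(key); // lazy cleanup on access, no timer involved
            return null;
        }
        return entry.value;
    }

    /** Number of entries currently held, including expired ones not yet accessed. */
    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        long[] now = {0L}; // manually advanced clock for the demo
        long thirtyMinutes = 30 * 60 * 1000L;
        TtlStateSketch<String, Integer> pageViews =
                new TtlStateSketch<>(thirtyMinutes, () -> now[0]);

        pageViews.update("session-1", 1);
        now[0] = 10 * 60 * 1000L;                         // 10 minutes later
        pageViews.update("session-1", pageViews.get("session-1") + 1);

        now[0] = 39 * 60 * 1000L;                         // 29 minutes after the last update
        System.out.println(pageViews.get("session-1"));   // still present

        now[0] = 40 * 60 * 1000L;                         // 30 minutes after the last update
        System.out.println(pageViews.get("session-1"));   // expired and discarded
    }
}
```

One consequence of lazy cleanup is that a key that is never read again keeps its entry in memory, so a real backend would presumably also need some background or checkpoint-time cleanup.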

