flink-dev mailing list archives

From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Unregistering Managed State in Operator Backend
Date Tue, 24 Jan 2017 13:20:48 GMT
Just a bit of clarification: operator state is independent of the keyed
state backends. Even if you use RocksDB, the operator state will not be
stored in RocksDB; only keyed state is stored there.

Right now, when an operator state (ListState) is empty, we still write
some metadata about that state. I think it should be easy to
change DefaultOperatorStateBackend to not write anything in case of an
empty state. What do you think, Stefan?
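
A minimal sketch of that idea, assuming a toy in-memory backend. ToyOperatorStateBackend and its methods are hypothetical stand-ins for illustration, not Flink's DefaultOperatorStateBackend or its API; the only point shown is that the snapshot loop skips states whose list is empty, so nothing (not even metadata) is written for them.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy backend for illustration only; not a Flink class.
class ToyOperatorStateBackend {
    // State name -> list elements, kept in registration order.
    private final Map<String, List<String>> registeredStates = new LinkedHashMap<>();

    // Returns the list state with the given name, registering it on first access.
    List<String> getListState(String name) {
        return registeredStates.computeIfAbsent(name, k -> new ArrayList<>());
    }

    // Builds the snapshot contents. Empty states are skipped entirely,
    // so no entry and no metadata is produced for them.
    Map<String, List<String>> snapshot() {
        Map<String, List<String>> written = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : registeredStates.entrySet()) {
            if (entry.getValue().isEmpty()) {
                continue; // nothing to write for an empty state
            }
            // Copy the elements so later mutations do not affect the snapshot.
            written.put(entry.getKey(), new ArrayList<>(entry.getValue()));
        }
        return written;
    }
}
```

In this toy model an empty state simply does not appear in the snapshot and would be registered again lazily on its next access after a restore. Whether that is acceptable for the real backend is exactly the open question above.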

On Tue, 24 Jan 2017 at 12:12 Paris Carbone <parisc@kth.se> wrote:

> Sure Till,
>
> I would love to also make the patch but need to prioritize some other
> things these days.
> At least I will dig and see how complex this is regarding the different
> backends.
>
> I also have some follow-up questions, in case anybody has thought about
> these things already (or is simply interested):
>
> - Do you think it would make sense to automatically garbage collect empty
> states in general?
> - Shouldn't this happen already during snapshot compaction (in RocksDB)
> and would that violate any user assumptions in your view?
>
>
> > On 24 Jan 2017, at 11:44, Till Rohrmann <trohrmann@apache.org> wrote:
> >
> > Hi Paris,
> >
> > if there is no such issue open, then please open one so that we can track
> > the issue. If you have time to work on that even better :-)
> >
> > Cheers,
> > Till
> >
> > On Tue, Jan 24, 2017 at 10:25 AM, Paris Carbone <parisc@kth.se> wrote:
> >
> >> Any thoughts/plans?
> >> So should I open a Jira and add this?
> >>
> >> Paris
> >>
> >> On Jan 21, 2017, at 5:17 PM, Paris Carbone <parisc@kth.se> wrote:
> >>
> >> Thank you for the answer Ufuk!
> >>
> >> To elaborate a bit more, I am not using keyed state; it would indeed be
> >> tricky in that case to discard everything.
> >>
> >> I need that for operator state, in my loop fault tolerance PR [1]. The
> >> idea is to tag a ListState (upstream log) per snapshot id.
> >> When a concurrent snapshot is committed I want to simply remove everything
> >> related to that ListState (not just clear it). This would also eliminate a
> >> memory leak in case many empty logs (and thus state entries) accumulate
> >> over time.
> >> Hope that makes it a bit more clear. Thanks again :)
> >>
> >> Paris
> >>
> >> [1] https://github.com/apache/flink/pull/1668
> >>
> >>
> >> On 21 Jan 2017, at 17:10, Ufuk Celebi <uce@apache.org> wrote:
> >>
> >> Hey Paris!
> >>
> >> As far as I know it's not possible at the moment and not planned. It does
> >> not sound too hard to add though. @Stefan: correct?
> >>
> >> You can currently only clear the state via #clear, in the scope of the
> >> key for keyed state or of the whole operator for operator state. In the
> >> case of keyed state it's indeed hard to clear all state; for operator
> >> state it's slightly better. I'm curious what your use case is?
> >>
> >> - Ufuk
> >>
> >>
> >> On Fri, Jan 20, 2017 at 5:59 PM, Paris Carbone <parisc@kth.se> wrote:
> >> Hi folks,
> >>
> >> I have a little question regarding the managed state operator backend, in
> >> case someone can help.
> >>
> >> Is there some convenient way (planned or under development) to completely
> >> unregister a state entry (e.g. a ListState) with a given id from the
> >> backend?
> >> It is fairly easy to register new states dynamically (i.e. with
> >> getOperatorState(...)), so why not be able to discard them as well?
> >>
> >> I would find this feature extremely convenient for a fault-tolerance
> >> related PR I am working on, but I can think of many use cases that might
> >> need it.
> >>
> >>
> >> Paris
> >>
> >>
> >>
>
>
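
To make the difference between clearing and unregistering in the quoted use case concrete, here is a small sketch under stated assumptions. SnapshotLogRegistry and all of its methods are hypothetical names for illustration; this is plain Java, not Flink API, and it only models one log per snapshot id where committing a snapshot drops the whole entry rather than emptying it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical registry of per-snapshot logs; not a Flink class.
class SnapshotLogRegistry {
    // Snapshot id -> upstream log recorded for that snapshot.
    private final Map<Long, List<String>> logsBySnapshot = new HashMap<>();

    // Returns the log for the snapshot id, registering it on first access.
    List<String> getOrRegister(long snapshotId) {
        return logsBySnapshot.computeIfAbsent(snapshotId, id -> new ArrayList<>());
    }

    // Clearing empties the log but keeps the entry registered,
    // so empty entries can pile up over many snapshots.
    void clear(long snapshotId) {
        List<String> log = logsBySnapshot.get(snapshotId);
        if (log != null) {
            log.clear();
        }
    }

    // Unregistering removes the entry itself.
    // Returns true if an entry existed for this snapshot id.
    boolean unregister(long snapshotId) {
        return logsBySnapshot.remove(snapshotId) != null;
    }

    // Number of entries currently registered, empty or not.
    int registeredCount() {
        return logsBySnapshot.size();
    }
}
```

With clear alone, registeredCount() never shrinks, which is the accumulation of empty state entries described in the thread; unregister is the operation being asked for.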
