beam-commits mailing list archives

From "Kenneth Knowles (JIRA)" <>
Subject [jira] [Commented] (BEAM-1517) Garbage collect user state in Flink Runner
Date Wed, 22 Feb 2017 22:15:45 GMT


Kenneth Knowles commented on BEAM-1517:

[~aljoscha] that makes sense to me. In both the direct runner and the Dataflow runner it was
easy to implement this "under the hood" while keeping the system timers totally separate from
the user timers, which is why I didn't produce a shared utility as I usually try to do. It
shouldn't be much work to build such a `StatefulDoFnRunner`, and it will be useful in general.

> Garbage collect user state in Flink Runner
> ------------------------------------------
>                 Key: BEAM-1517
>                 URL:
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
> User-facing state/timers in Beam are bound to the key/window of the data. Right now,
> the Flink Runner does not clean up user state when the watermark passes the GC horizon for
> the state associated with a given window.
> Neither {{StateInternals}} nor the Flink state API supports discarding state for a whole
> namespace (which is the window in this case), so we might have to manually set a GC timer for
> each window/key combination, as is done in the {{ReduceFnRunner}}. For this we have to know
> all states a user can possibly use, which we can get from the {{DoFn}} signature.
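
The approach described above (one GC timer per key/window pair, which clears every declared
state once the watermark passes the GC horizon) can be sketched as a small self-contained
model. This is only an illustration under stated assumptions: `StateGcSketch`, `GcTimer`, and
all method names are hypothetical and are not Beam or Flink APIs. The GC horizon is assumed to
be the window end plus an allowed lateness, and a timer is assumed to fire once the watermark
is strictly past that horizon.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

/** Hypothetical model of per-key/window state GC; not Beam or Flink API. */
class StateGcSketch {
    /** A GC timer for one key/window pair, due at that window's GC horizon. */
    static final class GcTimer implements Comparable<GcTimer> {
        final String key;
        final long windowEnd;
        final long fireAt;

        GcTimer(String key, long windowEnd, long fireAt) {
            this.key = key;
            this.windowEnd = windowEnd;
            this.fireAt = fireAt;
        }

        @Override
        public int compareTo(GcTimer other) {
            return Long.compare(fireAt, other.fireAt);
        }
    }

    // Stands in for the state ids that would be read from the DoFn signature.
    private final List<String> declaredStateIds;
    // Assumption: GC horizon = window end + allowed lateness.
    private final long allowedLateness;
    // Namespace "key@windowEnd" -> (state id -> value).
    private final Map<String, Map<String, Long>> state = new HashMap<>();
    private final PriorityQueue<GcTimer> timers = new PriorityQueue<>();

    StateGcSketch(List<String> declaredStateIds, long allowedLateness) {
        this.declaredStateIds = declaredStateIds;
        this.allowedLateness = allowedLateness;
    }

    private static String namespace(String key, long windowEnd) {
        return key + "@" + windowEnd;
    }

    /** Writes a state cell; the first write for a key/window also sets its GC timer. */
    void write(String key, long windowEnd, String stateId, long value) {
        String ns = namespace(key, windowEnd);
        Map<String, Long> cells = state.get(ns);
        if (cells == null) {
            cells = new HashMap<>();
            state.put(ns, cells);
            timers.add(new GcTimer(key, windowEnd, windowEnd + allowedLateness));
        }
        cells.put(stateId, value);
    }

    /** Returns the stored value, or null if the cell is absent or was collected. */
    Long read(String key, long windowEnd, String stateId) {
        Map<String, Long> cells = state.get(namespace(key, windowEnd));
        return cells == null ? null : cells.get(stateId);
    }

    /** Fires every GC timer whose horizon the watermark has passed; returns the count. */
    int advanceWatermark(long watermark) {
        int fired = 0;
        while (!timers.isEmpty() && timers.peek().fireAt < watermark) {
            GcTimer timer = timers.poll();
            String ns = namespace(timer.key, timer.windowEnd);
            Map<String, Long> cells = state.get(ns);
            if (cells != null) {
                // No "drop whole namespace" call is assumed to exist, so each
                // declared state is cleared individually.
                for (String stateId : declaredStateIds) {
                    cells.remove(stateId);
                }
                if (cells.isEmpty()) {
                    state.remove(ns);
                }
            }
            fired++;
        }
        return fired;
    }

    /** Number of key/window namespaces that still hold state. */
    int liveNamespaces() {
        return state.size();
    }
}
```

In this model, the list of declared state ids is what makes cleanup possible: because the
timer cannot drop a namespace wholesale, it has to know every state it must clear, which is
the role the {{DoFn}} signature plays in the description above.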

This message was sent by Atlassian JIRA
