beam-commits mailing list archives

From "Jingsong Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-1517) Garbage collect user state in Flink Runner
Date Tue, 21 Feb 2017 13:51:44 GMT

    [ https://issues.apache.org/jira/browse/BEAM-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15876005#comment-15876005 ]

Jingsong Lee commented on BEAM-1517:
------------------------------------

Would it be appropriate to let the user do the GC work themselves?
For example:
{code}
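  // Assumes stateId, lateDataTag and the "GcTimer" timer are declared elsewhere in the DoFn.
  @ProcessElement
  public void process(
      ProcessContext c,
      BoundedWindow window,
      @StateId(stateId) ValueState<Integer> state,
      @TimerId("GcTimer") Timer timer) {
    // GC time: end of the window plus the allowed lateness (10 seconds, in milliseconds).
    Instant maxTimestamp = window.maxTimestamp();
    long allowedLateness = 10 * 1000;
    Instant gcTime = maxTimestamp.plus(allowedLateness);
    // Can Timer have a getCurrentTime interface?
    // Without one, this reads the wall clock rather than the watermark.
    Instant currentTime = new Instant();
    if (gcTime.isBefore(currentTime)) {
      // The window is already past its GC time: send the element to the late-data output.
      c.sideOutput(lateDataTag, c.element());
    } else {
      // Schedule cleanup of this key/window's state at the GC time.
      timer.set(gcTime);
      // user logic
      // ....
    }
  }

  // Fires at gcTime and clears the user state for this key and window.
  @OnTimer("GcTimer")
  public void gc(
      OnTimerContext context,
      @StateId(stateId) ValueState<Integer> state) {
    state.clear();
  }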
{code}
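
The GC-time arithmetic in the snippet above can be checked in isolation. The following is only a minimal sketch in plain Java (java.time instead of Joda-Time, no Beam dependency); the class and method names are hypothetical. It assumes the "current time" is passed in explicitly as a watermark, since the issue description talks about the watermark passing the GC horizon, whereas the snippet reads the wall clock.

```java
import java.time.Duration;
import java.time.Instant;

/** Sketch of the GC-horizon check; all names here are hypothetical, not Beam API. */
final class GcHorizonSketch {
    private GcHorizonSketch() {}

    /** GC time = maximum timestamp of the window + allowed lateness. */
    static Instant gcTime(Instant windowMaxTimestamp, Duration allowedLateness) {
        return windowMaxTimestamp.plus(allowedLateness);
    }

    /**
     * True when the GC time lies strictly before the given watermark,
     * mirroring the gcTime.isBefore(currentTime) test in the snippet.
     */
    static boolean isExpired(
            Instant windowMaxTimestamp, Duration allowedLateness, Instant watermark) {
        return gcTime(windowMaxTimestamp, allowedLateness).isBefore(watermark);
    }

    public static void main(String[] args) {
        Instant windowEnd = Instant.parse("2017-02-21T13:00:00Z");
        Duration lateness = Duration.ofSeconds(10);
        Instant watermark = Instant.parse("2017-02-21T13:00:05Z");

        System.out.println("gcTime  = " + gcTime(windowEnd, lateness));
        System.out.println("expired = " + isExpired(windowEnd, lateness, watermark));
    }
}
```

Passing the watermark as a parameter keeps the check deterministic and testable; whether user code can actually obtain the watermark is exactly the open question raised in the comment inside the snippet.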


> Garbage collect user state in Flink Runner
> ------------------------------------------
>
>                 Key: BEAM-1517
>                 URL: https://issues.apache.org/jira/browse/BEAM-1517
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
>
> User-facing state/timers in Beam are bound to the key/window of the data. Right now,
> the Flink Runner does not clean up user state when the watermark passes the GC horizon for
> the state associated with a given window.
> Neither {{StateInternals}} nor the Flink state API supports discarding state for a whole
> namespace (which is the window in this case) so we might have to manually set a GC timer for
> each window/key combination, as is done in the {{ReduceFnRunner}}. For this we have to know
> all states a user can possibly use, which we can get from the {{DoFn}} signature.
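
The runner-side approach described in the issue (one GC timer per key/window, clearing every state declared by the DoFn when it fires) can be illustrated with a toy model. This is only a sketch under stated assumptions: it uses plain Java maps rather than {{StateInternals}} or the Flink state API, every class and method name is hypothetical, and the set of declared state ids stands in for what would be read from the {{DoFn}} signature.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of per key/window GC timers; hypothetical names, not Beam or Flink API. */
final class UserStateGcSketch {
    // One cell per "key|windowMaxTimestamp|stateId".
    private final Map<String, Integer> cells = new HashMap<>();
    // One GC timer per "key|windowMaxTimestamp" namespace, holding its GC time in millis.
    private final Map<String, Long> gcTimers = new HashMap<>();
    // Stand-in for the state ids obtained from the DoFn signature.
    private final Set<String> declaredStateIds;
    private final long allowedLatenessMillis;

    UserStateGcSketch(Set<String> declaredStateIds, long allowedLatenessMillis) {
        this.declaredStateIds = new HashSet<>(declaredStateIds);
        this.allowedLatenessMillis = allowedLatenessMillis;
    }

    private static String namespace(String key, long windowMaxTimestampMillis) {
        return key + "|" + windowMaxTimestampMillis;
    }

    /** Writes a state cell and registers the GC timer for its key/window if missing. */
    void write(String key, long windowMaxTimestampMillis, String stateId, int value) {
        if (!declaredStateIds.contains(stateId)) {
            throw new IllegalArgumentException("undeclared state id: " + stateId);
        }
        String ns = namespace(key, windowMaxTimestampMillis);
        cells.put(ns + "|" + stateId, value);
        gcTimers.putIfAbsent(ns, windowMaxTimestampMillis + allowedLatenessMillis);
    }

    /** Returns the stored value, or null if the cell is absent or was collected. */
    Integer read(String key, long windowMaxTimestampMillis, String stateId) {
        return cells.get(namespace(key, windowMaxTimestampMillis) + "|" + stateId);
    }

    /** Fires every GC timer strictly behind the watermark; returns the cleared namespaces. */
    List<String> advanceWatermark(long watermarkMillis) {
        List<String> cleared = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = gcTimers.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> timer = it.next();
            if (timer.getValue() < watermarkMillis) {
                // No "clear whole namespace" call exists, so clear each declared state by id.
                for (String stateId : declaredStateIds) {
                    cells.remove(timer.getKey() + "|" + stateId);
                }
                cleared.add(timer.getKey());
                it.remove();
            }
        }
        return cleared;
    }

    int liveCells() {
        return cells.size();
    }

    public static void main(String[] args) {
        UserStateGcSketch runner =
                new UserStateGcSketch(new HashSet<>(Arrays.asList("count", "sum")), 10_000L);
        runner.write("a", 1_000L, "count", 1);
        runner.write("b", 50_000L, "count", 2);
        System.out.println("cleared: " + runner.advanceWatermark(11_001L));
        System.out.println("live cells: " + runner.liveCells());
    }
}
```

The loop over the declared state ids is the point of the sketch: because nothing can drop a whole namespace at once, the cleanup needs to know up front every state a user can possibly touch.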



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
