flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aljoscha Krettek <aljos...@apache.org>
Subject Re: List State in RichWindowFunction leads to RocksDb memory leak
Date Fri, 24 Feb 2017 17:09:40 GMT
Hi Seth,
yes, this is a thorny problem but I actually see one additional possible
solution (that will, however, break other possible use cases.

First, regarding your solution 1):
We are working on adding this for ProcessWindowFunction:
https://issues.apache.org/jira/browse/FLINK-4953. ProcessWindowFunction is
a more powerful interface that allows querying more context about a window
firing. This will replace the current WindowFunction in the future.
Unfortunately this doesn't help you with your current situation.

About 2), do you have any idea why the state is getting so big? Do you see
the state of the second (count) window operator growing very large? The
problem with count windows is that they never get garbage collected if you
don't reach the count required by a Trigger. If you have an evolving key
space this means that your state will possibly grow forever.

The third solution that I can think of is to make state of a window
function implicitly scoped to both the key and window. Right now, state is
"global" across time and only scoped to a key. If we also scoped to the
window we could keep track of all state created for a window and then
garbage collect that once the window expires. This, however, will break
things for people that rely on this state being global. I'll bring this up
on the dev mailing list to see what people think about it? Are you also
following that one? So that you could chime in.

I'm afraid I don't have a good solution for you before Flink 1.3 come out,
other than writing your own custom operator or copying the WindowOperator.

What do you think?

Best,
Aljoscha

On Thu, 23 Feb 2017 at 16:12 Seth Wiesman <swiesman@mediamath.com> wrote:

> I am working on a program that uses a complex window and have run into
> some issues. It is a 1 hour window with 7 days allowed lateness including a
> custom trigger that gives us intermediate results every 5 minutes of
> processing time until the end of 7 days event time when a final fire is
> triggered and the window is purged. The window functions are an incremental
> reduce function as well as a RichWindowFunction which performs some final
> computation before outputting each result. I am building up a collection of
> objects so each time the RichWindowFunction is run I want to take a diff
> with the previous set to only output elements that have changed.
>
>
>
> Example:
>
>
>
> //In reality I am working with more complex objects than ints.
>
> class CustomRichWindowFunction extends RichWindowRunction[Collection[Int],
> Int, Key, TimeWindow] {
>
>                 @transient var state: ListState[Int]= _
>
>
>
>                 override def open(parameters: Configuration): Unit = {
>
>                 val info = new ListStateDescriptor(“previous”,
> createTypeInformation[Int])
>
>                 state = getRuntimeContext.getListState(info)
>
> }
>
>
>
> override def apply(key: Key, window: TimeWindow, input:
> Iterable[Collection[Int]], out: Collector[Int]): Unit = {
>
>                 val current = input.iterator.next
>
>                 val previous = state.get().iterator.asScala.toSet
>
>                 previous.clear()
>
>
>
> for (elem <- current) {
>                 if (!previous.contains(elem)) {
>
>                 out.collect(elem)
>
> }
>
>
>
> state.add(elem) //store for the next run
>
> }
>
> }
>
> }
>
>
>
> The issue with this is that it causes a memory leak with RocksDb. When the
> WindowOperator executes clearAllState
> <https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/windowing/WindowOperator.java#L527>
> at the end of the windows lifetime it does not clear the ListState or any
> other type of custom partitioned state that may have been created. This
> causes my state size to grow indefinitely. It appears to me that a
> RichWindowFunction should have a clear method, similar to triggers, for
> cleaning up state when the window is destroyed.
>
>
>
> Barring that I can envision two ways of solving this problem but have come
> short of successfully implementing them.
>
>
>
> 1)       If I had access to the watermark from within apply I could use
> that in conjuction with the TimeWindow passed in and be able to tell if it
> was my final EventTimeTimer that had gone off allowing me to manually clear
> the state:
>
>
>
> ie: if (watermark < window.getEnd  + Time.days(7).getMilliseconds) {
>
>                                 state.add(elem) // I know that my window
> is not finished so I can store state.
>
>                      }
>
>
>
> 2)       Pass my elements into a second window with a count trigger of 1
> and a custom evictor which always keeps the two most recent elements and
> then do my diff there.
>
> Semantically this seems to work but in practice it causes my checkpoint
> times to grow 10x and I seem to fail every 5th-7th checkpoint.
>
>
>
> I am curious if anyone here has any ideas of what I might be able to do to
> solve this problem.
>
>
>
> Thank you,
>
>
>
> Seth Wiesman
>

Mime
View raw message