flink-user mailing list archives

From Daniel Santos <dsan...@cryptolab.net>
Subject Flink Time Window State
Date Thu, 03 Nov 2016 18:43:16 GMT
Hello,

I have a question that has been bugging me.
Let's say we have a Kafka source.
Checkpointing is enabled, with an interval of 5 seconds.
We use the FsStateBackend (on Hadoop).
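
A minimal sketch of that setup, assuming the Flink 1.x Scala API. The checkpoint URI is a hypothetical placeholder and the Kafka source is left out:

```scala
import org.apache.flink.runtime.state.filesystem.FsStateBackend
import org.apache.flink.streaming.api.scala._

object CheckpointSetupSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Take a checkpoint every 5 seconds (interval is in milliseconds).
    env.enableCheckpointing(5000L)

    // Store checkpoint data on a Hadoop-compatible file system.
    // The URI below is an assumption, not the real cluster path.
    env.setStateBackend(new FsStateBackend("hdfs://namenode:8020/flink/checkpoints"))

    // The Kafka source and the windowed job would be defined here,
    // followed by env.execute(...).
  }
}
```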

Now imagine we have a tumbling window of 10 minutes.

For simplicity, say we are counting all elements arriving within those
10 minutes, with something like this:

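import org.apache.flink.api.common.functions.FoldFunction

// Counts elements: each incoming event increments the accumulator by one.
class Count extends FoldFunction[Event, Long] {

    override def fold(accumulator: Long, value: Event): Long = {
      accumulator + 1
    }

}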

So we have

source.
      window(<Tumbling>).
      apply(0L, new Count(), WindowFunction())
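
The per-window result this is meant to produce can be sketched in plain Scala, independent of Flink. This is only an illustration of the counting semantics, not of how Flink stores or restores the state; the `Event` fields, `windowStart`, and `countPerWindow` are hypothetical names introduced for the example:

```scala
// Hypothetical event type: only a timestamp in milliseconds.
final case class Event(timestampMillis: Long)

object CountFoldSketch {
  // Tumbling window size: 10 minutes, in milliseconds.
  val windowSizeMillis: Long = 10 * 60 * 1000L

  // Start of the tumbling window that contains the given timestamp.
  def windowStart(timestampMillis: Long): Long =
    timestampMillis - (timestampMillis % windowSizeMillis)

  // Same logic as Count.fold: add one per event, ignore the event itself.
  def fold(accumulator: Long, value: Event): Long = accumulator + 1

  // Group events by window and fold each group starting from 0.
  def countPerWindow(events: Seq[Event]): Map[Long, Long] =
    events
      .groupBy(e => windowStart(e.timestampMillis))
      .map { case (start, inWindow) => start -> inWindow.foldLeft(0L)(fold) }
}
```

With 10 events in the first 2 minutes, the accumulator for the first window is 10 at the moment the job stops, which is the value the question below is about.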

In the first 2 minutes, 10 events arrive. Then we stop the
stream/task/job, or it fails, and it is restarted. What will be the
state of the fold function?
Will it be 10, so that it resumes from there? Or will it be 0?

This is important to know: imagine we have a window of 1 day
and the job fails midday. How will it resume?

Best Regards

