samza-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Shekar Tippur <ctip...@gmail.com>
Subject Re: Samza and sliding window
Date Mon, 29 Jun 2015 19:41:09 GMT
Yi,

My use case is more of the latter. Your explanation makes sense now. I was
also looking into Milinda's wiki. She has a section for Kafka
partition SimplePartitioner, which is simple enough as well.

Thanks for all the inputs. Let me see what I come up with while
implementing it.

- Shekar

On Mon, Jun 29, 2015 at 10:42 AM, Yi Pan <nickpan47@gmail.com> wrote:

> Hi, Shekar,
>
> First, I would like to clarify what you meant by sliding window: is it
> defined as windows with size N and advance step size of 1 (which means that
> windows overlap and each input message would contribute to multiple counts
> in different windows)? Or windows with size N and advance step size of N
> (i.e. each incoming message only contribute to one counter in a single
> window)?
>
> If your use case falls into the first category, you will need something
> more sophisticated as discussed in SAMZA-552. If your use case is the
> second one, there could be a simpler version of SAMZA-552 that you can go
> with:
>
> 1) Initiate a KV-store that uses the application name as the key
> 2) For each incoming message, look for the windows that the message by the
> application name
> 3) Update the counter and update the value in the KV-store based on the
> application name
> 4) Every 5 min when window() method is triggered, set all counters to zero
> (this can be done in a lazy way as well, by keeping the last reset
> timestamp in the record in the KV-store, keyed by application name. Then,
> resetting counter to zero can be done when next time the application
> counter is updated again)
>
> Hope that makes sense.
>
> -Yi
>
> On Mon, Jun 29, 2015 at 10:06 AM, Shekar Tippur <ctippur@gmail.com> wrote:
>
> > Benjamin,
> >
> > Thanks for the explanation. We dont have any specific partition scheme as
> > yet. We just have 2 topics - raw and processed and we use default
> > partitioning scheme.
> > Can you share any code snippet so I can understand it better?
> >
> > - Shekar
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message