samza-dev mailing list archives

From Shekar Tippur <ctip...@gmail.com>
Subject Re: Samza and sliding window
Date Fri, 26 Jun 2015 17:53:50 GMT
Yan,


*What do you mean by "a local cache"? Is it a db like MySQL, something
like RocksDB, or even just in-memory?*

By local cache I mean something like Redis.



*When you say "another topic", is this the topic consumed by the same
Samza job as your 5-minutes-job, or in a separate job? What is the
relation between the topic and the application name?*

We don't have a 5-minute job. All we have now is a stream of events coming
from a bunch of applications. All of these land on a raw Kafka topic. Each
event carries the application name. I want to create a job that takes the
incoming stream, groups it by application name, and counts the number of
events we get in a 5-minute sliding window.
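
For illustration, the per-application counting described above could be
sketched in plain Java as below. This is not Samza API code: the class name
SlidingWindowCounter and its methods are hypothetical, the state is held in
memory only, and it assumes events for each application arrive in roughly
timestamp order.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Samza: keeps event timestamps per
// application and counts those that fall inside a sliding window.
class SlidingWindowCounter {
    private final long windowMillis;
    private final Map<String, Deque<Long>> timestampsByApp = new HashMap<>();

    SlidingWindowCounter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Record one event for the given application.
    // Assumption: timestamps per application are (roughly) non-decreasing.
    void record(String appName, long eventTimeMillis) {
        timestampsByApp
            .computeIfAbsent(appName, k -> new ArrayDeque<>())
            .addLast(eventTimeMillis);
    }

    // Count events for the application in the window (now - window, now].
    // Timestamps at or before the cutoff are evicted as a side effect.
    int count(String appName, long nowMillis) {
        Deque<Long> timestamps = timestampsByApp.get(appName);
        if (timestamps == null) {
            return 0;
        }
        long cutoff = nowMillis - windowMillis;
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= cutoff) {
            timestamps.removeFirst();
        }
        return timestamps.size();
    }
}
```

With a 5-minute window this would be constructed as
new SlidingWindowCounter(5 * 60 * 1000L). In a real job the counts would
still have to be written to another topic or a store, and the state would
need to survive restarts; that wiring is left out of this sketch.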

- Shekar

On Fri, Jun 26, 2015 at 10:29 AM, Yan Fang <yanfang724@gmail.com> wrote:

> Hi Shekar,
>
> Need a little more clarification.
>
> What do you mean by "a local cache"? Is it a db like MySQL, something like
> RocksDB, or even just in-memory?
>
> When you say "another topic", is this the topic consumed by the same Samza
> job as your 5-minutes-job, or in a separate job? What is the relation
> between the topic and the application name?
>
> Thanks,
>
> Fang, Yan
> yanfang724@gmail.com
>
> On Fri, Jun 26, 2015 at 1:08 AM, Shekar Tippur <ctippur@gmail.com> wrote:
>
> > Hello,
> > My apologies if I have raised it earlier.
> > Here is the use case:
> > I have a stream that is partitioned based on application name. I want to be
> > able to count the number of events happening for that particular
> > application in the past 5 minutes (sliding window) and update either
> > another topic or a local cache.
> >
> > Is this possible via 0.9 version of Samza?
> > If not, what is the easiest way to achieve this?
> >
> > - Shekar
> >
>
