beam-user mailing list archives

From Aleksandr <aleksandr...@gmail.com>
Subject Re: Combining and ranking fired panes
Date Sun, 30 Oct 2016 19:12:44 GMT
Hello Nick,
I suppose that sliding windows would suit your project:
https://cloud.google.com/dataflow/model/windowing

Best regards
Aleksandr

On Sun, 30 Oct 2016 at 2:18, Nick Travers <n.e.travers@gmail.com> wrote:

> Hi - I'm wondering how I'd go about combining results from repeated
> speculative firings of a window into a single, consolidated "pane".
>
> In my current use-case, I have items with scores arriving continuously,
> and I'm using hourly windows with speculative firings every minute, with
> the panes being accumulated. Every time a pane fires, I'd like to be able
> to (re-)rank the top ten items by score, descending.
>
> For example, if I have three items A, B and C arriving over the course of
> an hour with continuously changing scores, as follows:
>
> ------- window start
> (A, 1)
> (B, 2)
> (C, 3)
> ------- first firing (EARLY)
> (B, 4)
> ------- second firing (EARLY)
> (C, 0)
> ------- window closes (ON_TIME)
>
> then I'm hoping to see the following results when each pane is fired.
>
> After first firing:
> (C, 3)
> (B, 2)
> (A, 1)
>
> After second firing:
> (B, 4)
> (C, 3)
> (A, 1)
>
> On close of the window:
> (B, 4)
> (A, 1)
> (C, 0)
>
> I'm currently using Top.of().withoutDefaults() to give me the ranking, but
> this seems to only give a single ON_TIME pane with all of the interim
> panes combined first and _then_ ranked on the score, so I get something
> like:
> (B, 4)
> (B, 4)
> (B, 2)
> (C, 3)
> (C, 3)
> (A, 1)
> (A, 1)
> (A, 1)
> (C, 0)
>
> Should I be using a different approach / pattern to continually rank each
> accumulated pane that is fired?
>
> Testing this with the DirectRunner, but I also see something similar when
> running with BlockingDataflowRunner.
>
> Thanks in advance!
> - nick
>
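
For reference, the ranking described in the quoted message can be sketched in plain Java, outside of Beam. This is only an illustration of the expected per-firing output, not a Beam pipeline and not the list's recommended solution. The class `PaneRanking` and the method `rankLatest` are hypothetical names. The sketch assumes that the contents of each accumulated pane are available in arrival order, so that the last score seen for an item is its current score; a real pipeline would probably have to compare element timestamps instead. It needs Java 9 or later for `Map.entry` and `List.of`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PaneRanking {

    /**
     * Hypothetical helper: keeps only the most recent score per item
     * (assuming the input is in arrival order), then returns up to
     * `limit` items ranked by score, descending.
     */
    static List<String> rankLatest(List<Map.Entry<String, Integer>> accumulated, int limit) {
        // A later score for the same item overwrites the earlier one.
        Map<String, Integer> latest = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : accumulated) {
            latest.put(e.getKey(), e.getValue());
        }

        // Sort the surviving (item, score) pairs by score, highest first.
        List<Map.Entry<String, Integer>> ranked = new ArrayList<>(latest.entrySet());
        ranked.sort(Map.Entry.<String, Integer>comparingByValue().reversed());

        // Format the top `limit` entries the same way as in the message.
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : ranked) {
            if (out.size() == limit) {
                break;
            }
            out.add("(" + e.getKey() + ", " + e.getValue() + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        // Accumulated pane contents, growing with each firing.
        List<Map.Entry<String, Integer>> pane = new ArrayList<>();

        pane.add(Map.entry("A", 1));
        pane.add(Map.entry("B", 2));
        pane.add(Map.entry("C", 3));
        System.out.println(rankLatest(pane, 10)); // [(C, 3), (B, 2), (A, 1)]

        pane.add(Map.entry("B", 4));
        System.out.println(rankLatest(pane, 10)); // [(B, 4), (C, 3), (A, 1)]

        pane.add(Map.entry("C", 0));
        System.out.println(rankLatest(pane, 10)); // [(B, 4), (A, 1), (C, 0)]
    }
}
```

The key step is reducing each item to a single current score before ranking. Ranking the raw accumulated elements directly is what produces the duplicated entries shown in the message.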
