beam-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ankur Goenka <goe...@google.com>
Subject Re: CoGroupByKey only on window end.
Date Mon, 01 Oct 2018 21:17:40 GMT
CoGBK and GBK need consistent windowing in PCollection. In your case, a
custom solution is needed.
Here is another way which only need pipeline orchestration and might be
simpler.

Lets say you have pcollection A with 15 min window and pcollection B with 1
min window
Step 1: GBK pcollection A for 15 min window.
Step 2: Read GBK A and re-emit same value for 15 x 1 min windows. Lets call
this pcollection A'
Step 3: Now A' and B have same window. Do CoGBK on A' and B.
...

Thanks,
Ankur

On Mon, Oct 1, 2018 at 9:52 AM Akshay Balwally <abalwally@lyft.com> wrote:

> Hi everyone,
>
> I would like to use a CoGroupByKey statement on unevenly windowed streams
> (one of size 15 minutes, one of size 1 minute). As I understand it,
> CoGroupByKey groups first by key, then by window. But of course since the
> windows are not the same, my CoGroupByKey does not successfully join the
> streams.
>
> One idea I had is to extend CoGroupByKey to make some
> "CoGroupByKeyWindowEnd", that groups first by key, then by window.end. I
> just wanted to check first- is there a better way to do this? Or something
> natively supported by Beam?
>
> Thanks,
> Akshay
> --
> Akshay Balwally
> Software Engineer
> 9372716469 |
>
> <https://www.lyft.com/>
>

Mime
View raw message