flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Matt <dromitl...@gmail.com>
Subject Re: Updating a Tumbling Window every second?
Date Fri, 16 Dec 2016 11:58:42 GMT
I have reduced the problem to a simple image [1].

Those shown on the image are the streams I have, and the problem now is how
to create a custom window assigner such that objects in B that *don't share*
elements in A, are put together in the same window.

Why? Because in order to create elements in C (triangles), I have to
process n *independent* elements of B (n=2 in the example).

Maybe there's a better or simpler way to do this. Any idea is appreciated!

Regards,
Matt

[1] http://i.imgur.com/dG5AkJy.png

On Thu, Dec 15, 2016 at 3:22 AM, Matt <dromitlabs@gmail.com> wrote:

> Hello,
>
> I have a rather simple problem with a difficult explanation...
>
> I have 3 streams, one of objects of class A (stream A), one of class B
> (stream B) and one of class C (stream C). The elements of A are generated
> at a rate of about 3 times every second. Elements of type B encapsulates
> some key features of the stream A (like the number of elements of A in the
> window) during the last 30 seconds (tumbling window 30s). Finally, the
> elements of type C contains statistics (for simplicity let's say the
> average of elements processed by each element in B) of the last 3 elements
> in B and are produced on every new element of B (count window 3, 1).
>
> Illustrative example, () and [] denotes windows:
>
> ... [a10 a9 a8] [a7 a6] [a5 a4] [a3 a2 a1]
> ... (b4 [b3 b2) b1]
> ... [c2] [c1]
>
> This works fine, except for a dashboard that depends on the elements of C
> to be updated, and 30s is way too big of a delay. I thought I could change
> the tumbling window for a sliding window of size 30s and a slide of 1s, but
> this doesn't work.
>
> If I use a sliding window to create elements of B as mentioned, each count
> window would contain 3 elements of B, and I would get one element of C
> every second as intended, but those elements in B encapsulates almost the
> same elements of A. This results in stats that are wrong.
>
> For instance, c1 may have the average of b1, b2 and b3. But b1, b2 and b3
> share most of the elements from stream A.
>
> Question: is there any way to create a count window with the last 3
> elements of B that would have gone into the same tumbling window, not with
> the last 3 consecutive elements?
>
> I hope the problem is clear, don't hesitate to ask for further
> clarification!
>
> Regards,
> Matt
>
>

Mime
View raw message