flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Martin Eden <martineden...@gmail.com>
Subject dynamically partitioned stream
Date Thu, 31 Aug 2017 08:10:23 GMT
Hi all,

I am trying to implement the following using Flink:

I have 2 input message streams:

1. Data Stream:
KEY VALUE TIME
.
.
.
C      V6        6
B      V6        6
A      V5        5
A      V4        4
C      V3        3
A      V3        3
B      V3        3
B      V2        2
A      V1        1

2. Control Stream:
Lambda  ArgumentKeys TIME
.
.
.
f2            [A, C]                 4
f1            [A, B, C]            1

I want to apply the lambdas coming in the control stream to the selection
of keys that are coming in the data stream.

Since we have 2 streams I naturally thought of connecting them using
.connect. For this I need to key both of them by a certain criteria. And
here lies the problem, how can I make sure the messages with keys A,B,C
specified in the control stream end up in the same task as well as the
control message (f1, [A, B, C]) itself. Basically I don't know how to key
by to achieve this.

I suspect a custom partitioner is required that partitions the data stream
based on the messages in the control stream? Is this even possible?

Any suggestions welcomed!

Thanks,
M

Mime
View raw message