flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Eduardo Winpenny Tejedor <eduardo.winpe...@gmail.com>
Subject Re: Process stream multiple time with different KeyBy
Date Mon, 17 Feb 2020 20:07:10 GMT
Hi Sebastien,

Without being entirely sure of what's your use case/end goal I'll tell
you (some of) the options Flink provides you for defining a flow.

If your use case is to apply the same rule to each of your "swimlanes"
of data (one with category=foo AND subcategory=bar, another with
category=foo and another with category=bar) you can do this by
implementing your own org.apache.flink.api.java.functions.KeySelector
function for the keyBy function. You'll just need to return a
different key for each of your rules and the data will separate to the
appropriate "swimlane".

If your use case is to apply different rules to each swimlane then you
can write a ProcessFunction that redirects elements to different *side
outputs*. You can then apply different operations to each side output.

Your application could get tricky to evolve IF the number of swimlanes
or the operators are meant to change over time, you'd have to be
careful how the existing state fits into your new flows.


On Mon, Feb 17, 2020 at 7:06 PM Lehuede sebastien <lehuede.s@gmail.com> wrote:
> Hi all,
> I'm currently working on a Flink Application where I match events against a set of rules.
At the beginning I wanted to dynamically create streams following the category of events (Event
are JSON formatted and I've a field like "category":"foo" in each event) but I'm stuck by
the impossibility to create streams at runtime.
> So, one of the solution for me is to create a single Kafka topic and then use the "KeyBy"
operator to match events with "category":"foo" against rules also containing "category":"foo"
in rule specification.
> Now I have some cases where events and rules have one category and one subcategory. At
this point I'm not sure about the "KeyBy" operator behavior.
> Example :
> Events have : "category":"foo" AND "subcategory":"bar"
> Rule1 specification has : "category":"foo" AND "subcategory":"bar"
> Rule2 specification has : "category':"foo"
> Rule3 specification has : "category":"bar"
> In this case, my events need to be match against Rule1, Rule2 and Rule3.
> If I'm right, if I apply a multiple key "KeyBy()" with "category" and "subcategory" fields
and then apply two single key "KeyBy()" with "category" field, my events will be consumed
by the first "KeyBy()" operator and no events will be streamed in the operators after ?
> Is there any way to process the same stream one time for multi key KeyBy() and another
time for single key KeyBy() ?
> Thanks !
> Sébastien.

View raw message