flink-user mailing list archives

From Elias Levy <fearsome.lucid...@gmail.com>
Subject Re: Re-keying / sub-keying a stream without repartitioning
Date Tue, 25 Apr 2017 21:32:52 GMT

On Fri, Apr 21, 2017 at 10:15 PM, Elias Levy <fearsome.lucidity@gmail.com> wrote:

> This is something that has come up before on the list, but in a different
> context.  I have a need to rekey a stream but would prefer the stream to
> not be repartitioned.  There is no gain to repartitioning, as the new
> partition key is a composite of the stream key, going from a key of A to a
> key of (A, B), so all values for the resulting streams are already being
> routed to the same node, and repartitioning them to other nodes would
> simply generate unnecessary network traffic and serde overhead.
> Unlike previous use cases, I am not trying to perform aggregate
> operations.  Instead I am executing CEP patterns.  Some patterns apply to
> the stream keyed by A and some to the stream keyed by (A, B).
> The API does not appear to have an obvious solution to this situation.
> keyBy() will repartition, and there isn't something like subKey() to
> subpartition a stream without repartitioning (e.g. keyBy(A).subKey(B)).
> I suppose I could accomplish it by using partitionCustom(), ignoring the
> second element in the key, and delegating to the default partitioner
> passing it only the first element, thus resulting in no change of task
> assignment.
> Thoughts?
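
To make the partitionCustom() idea above concrete, here is a minimal sketch of the routing logic only. It has no Flink dependency: CompositeKey, partitionByA() and partitionComposite() are hypothetical names, and the hash-modulo in partitionByA() is just a stand-in for whatever the default partitioner does (Flink's real key routing goes through key groups, which this does not reproduce). The method shape mirrors Flink's Partitioner<K> interface, int partition(K key, int numPartitions).

```java
import java.util.Objects;

/** Hypothetical sketch: route a composite key (A, B) by its A component only. */
final class PrefixPartitioning {

    /** Hypothetical composite key standing in for the (A, B) pair from the thread. */
    static final class CompositeKey {
        final String a;
        final String b;

        CompositeKey(String a, String b) {
            this.a = a;
            this.b = b;
        }
    }

    /**
     * Stand-in for the default partitioner on key A (an assumption, not
     * Flink's actual key-group assignment): maps A to a channel index
     * in [0, numPartitions).
     */
    static int partitionByA(String a, int numPartitions) {
        return Math.floorMod(Objects.hashCode(a), numPartitions);
    }

    /**
     * Custom partitioner for (A, B): ignores B and delegates to the
     * partitioner for A, so the task assignment does not change.
     */
    static int partitionComposite(CompositeKey key, int numPartitions) {
        return partitionByA(key.a, numPartitions);
    }
}
```

With this, partitionComposite(new CompositeKey("a1", "b1"), n) and partitionComposite(new CompositeKey("a1", "b2"), n) both return the same index as partitionByA("a1", n) for any n > 0. Note that in Flink partitionCustom() returns a plain DataStream rather than a KeyedStream, so this alone would not give keyed state or keyed CEP on (A, B).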
