flink-user mailing list archives

From antonio saldivar <ansal...@gmail.com>
Subject Re: Flink Rebalance
Date Fri, 10 Aug 2018 10:50:00 GMT
Hi Fabian

Thank you. Yes, they are just map functions, so I will do it that way, with
plain method calls, to make it faster.

On Fri, Aug 10, 2018, 5:58 AM Fabian Hueske <fhueske@gmail.com> wrote:

> Hi,
>
> Elias and Paul have good points.
> I think the performance degradation is mostly due to the lack of function
> chaining in the rebalance case.
>
> If all steps are just map functions, they can be chained in the
> no-rebalance case.
> That means, records are passed via function calls.
> If you add rebalancing, records will be passed between map functions via
> serialization, network transfer, and deserialization.
> This is of course much more expensive than calling a method.
>
> Best, Fabian
>
> 2018-08-10 4:25 GMT+02:00 Paul Lam <paullin3280@gmail.com>:
>
>> Hi Antonio,
>>
>> AFAIK, there are two reasons for this:
>>
>> 1. Rebalancing itself brings latency because it takes time to
>> redistribute the elements.
>> 2. Rebalancing also messes up the order in the Kafka topic partitions,
>> and often makes an event-time window wait longer to trigger if you’re
>> using the event-time characteristic.
>>
>> Best Regards,
>> Paul Lam
>>
>>
>>
>> On Aug 10, 2018, at 05:49, antonio saldivar <ansale10@gmail.com> wrote:
>>
>> Hello
>>
>> Sending ~450 elements per second (the values are in milliseconds, start
>> to end), I went from this with rebalance:
>>
>> +------------+
>> | AVGWINDOW  |
>> +------------+
>> | 32131.0853 |
>> +------------+
>>
>> to this without rebalance:
>>
>> +------------+
>> | AVGWINDOW  |
>> +------------+
>> | 70.2077    |
>> +------------+
>>
>> On Thu, Aug 9, 2018 at 17:42, Elias Levy (<
>> fearsome.lucidity@gmail.com>) wrote:
>>
>>> What do you consider a lot of latency?  The rebalance will require
>>> serializing / deserializing the data as it gets distributed.  Depending on
>>> the complexity of your records and the efficiency of your serializers, that
>>> could have a significant impact on your performance.
>>>
>>> On Thu, Aug 9, 2018 at 2:14 PM antonio saldivar <ansale10@gmail.com>
>>> wrote:
>>>
>>>> Hello
>>>>
>>>> Does anyone know why adding "rebalance()" to my .map steps adds
>>>> a lot of latency compared to not having rebalance?
>>>>
>>>>
>>>> I have 44 Kafka partitions in my topic and 44 Flink task managers.
>>>>
>>>> The execution plan looks like this when I add rebalance, but it is adding
>>>> a lot of latency:
>>>>
>>>> kafka-src -> rebalance -> step1 -> rebalance -> step2 -> rebalance ->
>>>> kafka-sink
>>>>
>>>> Thank you
>>>> regards
>>>>
>>>>
>>
>
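
To make Fabian's point about chaining concrete, here is a minimal plain-Java
sketch. It is not Flink code and not the job from this thread: the Event
type and the two step functions are hypothetical, Java's built-in
serialization stands in for Flink's serializers, and the network transfer is
left out entirely. In the DataStream API the two cases would roughly
correspond to stream.map(f).map(g) versus stream.map(f).rebalance().map(g).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

public class ChainingSketch {

    // Hypothetical record type; the thread does not show the real one.
    public static final class Event implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String key;
        public final long value;

        public Event(String key, long value) {
            this.key = key;
            this.value = value;
        }
    }

    // Hypothetical stand-ins for "step1" and "step2" in the execution plan.
    static final Function<Event, Event> STEP1 = e -> new Event(e.key, e.value + 1);
    static final Function<Event, Event> STEP2 = e -> new Event(e.key, e.value * 2);

    // Chained case: step1 hands its result to step2 with a plain method call.
    public static Event chained(Event in) {
        return STEP2.apply(STEP1.apply(in));
    }

    // Unchained case: the record is serialized and deserialized between steps.
    public static Event unchained(Event in) {
        return STEP2.apply(roundTrip(STEP1.apply(in)));
    }

    // Serialize to bytes and read back, as a stand-in for crossing a task boundary.
    static Event roundTrip(Event e) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(e);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Event) in.readObject();
            }
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Event in = new Event("k", 20L);
        int n = 100_000;

        long t0 = System.nanoTime();
        long sumChained = 0;
        for (int i = 0; i < n; i++) {
            sumChained += chained(in).value;
        }
        long t1 = System.nanoTime();
        long sumUnchained = 0;
        for (int i = 0; i < n; i++) {
            sumUnchained += unchained(in).value;
        }
        long t2 = System.nanoTime();

        // Both paths compute the same result; only the hand-off cost differs.
        System.out.println("same result: " + (sumChained == sumUnchained));
        System.out.println("chained   ms: " + (t1 - t0) / 1_000_000);
        System.out.println("unchained ms: " + (t2 - t1) / 1_000_000);
    }
}
```

The timings are only a rough illustration, not a benchmark. The gap reported
earlier in the thread will also include network transfer and, as Paul notes,
possibly longer waits for event-time windows.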
