flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ufuk Celebi <...@apache.org>
Subject Re: Can serialization be disabled between chains?
Date Mon, 16 Jan 2017 16:55:50 GMT
+1 to what Fabian said. Regarding the memory consumption: Flink's back
pressure mechanisms also depends on this, because the availability of
(network) buffers determines how fast operator can produce data. If no
buffers are available, the producing operator will slow down.

On Mon, Jan 16, 2017 at 2:32 PM, Dmitry Golubets <dgolubets@gmail.com> wrote:
> First issue is not a problem with idiomatic Scala - we make all our data
> objects immutable.
> Second.. yeah, I guess it makes sense.
> Thanks for clarification.
>
> Best regards,
> Dmitry
>
> On Mon, Jan 16, 2017 at 1:27 PM, Fabian Hueske <fhueske@gmail.com> wrote:
>>
>> One of the reasons is to ensure that data cannot be modified after it left
>> a thread.
>> A function that emits the same object several times (in order to reduce
>> object creation & GC) might accidentally modify emitted records if they
>> would be put as object in a queue.
>> Moreover, it is easier to control the memory consumption if data is
>> serialized into a fixed number of buffers instead of being put on the JVM
>> heap.
>>
>> Best, Fabian
>>
>> 2017-01-16 14:21 GMT+01:00 Dmitry Golubets <dgolubets@gmail.com>:
>>>
>>> Hi Ufuk,
>>>
>>> Do you know what's the reason for serialization of data between different
>>> threads?
>>>
>>> Also, thanks for the link!
>>>
>>> Best regards,
>>> Dmitry
>>>
>>> On Mon, Jan 16, 2017 at 1:07 PM, Ufuk Celebi <uce@apache.org> wrote:
>>>>
>>>> Hey Dmitry,
>>>>
>>>> this is not possible if I'm understanding you correctly.
>>>>
>>>> A task chain is executed by a single task thread and hence it is not
>>>> possible to continue processing before the record "leaves" the thread,
>>>> which only happens when the next task thread or the network stack
>>>> consumes it.
>>>>
>>>> Hand over between chained tasks happens without serialization. Only
>>>> data between different task threads is serialized.
>>>>
>>>> Depending on your use case the newly introduced async I/O feature
>>>> might be worth a look (will be part of the upcoming 1.2 release):
>>>> https://github.com/apache/flink/pull/2629
>>>
>>>
>>
>

Mime
View raw message