flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dmitry Golubets <dgolub...@gmail.com>
Subject Re: Can serialization be disabled between chains?
Date Mon, 16 Jan 2017 13:32:18 GMT
First issue is not a problem with idiomatic Scala - we make all our data
objects immutable.
Second.. yeah, I guess it makes sense.
Thanks for clarification.

Best regards,
Dmitry

On Mon, Jan 16, 2017 at 1:27 PM, Fabian Hueske <fhueske@gmail.com> wrote:

> One of the reasons is to ensure that data cannot be modified after it left
> a thread.
> A function that emits the same object several times (in order to reduce
> object creation & GC) might accidentally modify emitted records if they
> would be put as object in a queue.
> Moreover, it is easier to control the memory consumption if data is
> serialized into a fixed number of buffers instead of being put on the JVM
> heap.
>
> Best, Fabian
>
> 2017-01-16 14:21 GMT+01:00 Dmitry Golubets <dgolubets@gmail.com>:
>
>> Hi Ufuk,
>>
>> Do you know what's the reason for serialization of data between different
>> threads?
>>
>> Also, thanks for the link!
>>
>> Best regards,
>> Dmitry
>>
>> On Mon, Jan 16, 2017 at 1:07 PM, Ufuk Celebi <uce@apache.org> wrote:
>>
>>> Hey Dmitry,
>>>
>>> this is not possible if I'm understanding you correctly.
>>>
>>> A task chain is executed by a single task thread and hence it is not
>>> possible to continue processing before the record "leaves" the thread,
>>> which only happens when the next task thread or the network stack
>>> consumes it.
>>>
>>> Hand over between chained tasks happens without serialization. Only
>>> data between different task threads is serialized.
>>>
>>> Depending on your use case the newly introduced async I/O feature
>>> might be worth a look (will be part of the upcoming 1.2 release):
>>> https://github.com/apache/flink/pull/2629
>>>
>>
>>
>

Mime
View raw message