flink-user mailing list archives

From Протченко Алексей <tverdy...@mail.ru>
Subject Re[2]: POJO serialization vs immutability
Date Mon, 07 Oct 2019 14:17:45 GMT

Sorry, but what about immutability in general? It seems there is no way to have proper immutable
chunks inside the stream (while mutable chunks inside a stream look like some kind of «code
smell»). Or am I just missing something?
Best regards,
>Monday, 7 October 2019, 16:13 +03:00, from Jan Lukavský <je.ik@seznam.cz>:
>Exactly. And that's why it is good for mutable data, because they are not suited for keys.
>On 10/7/19 2:58 PM, Chesnay Schepler wrote:
>>The default hashCode implementation is effectively random and not suited for keys,
>>as they may not be routed to the same instance.
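The point about the default hashCode can be illustrated with plain Java, outside of Flink. This is a minimal sketch; the class names `IdentityKey` and `ValueKey` are hypothetical and appear nowhere in the thread. Two logically identical keys that inherit Object's identity-based equals()/hashCode() are never considered equal, whereas a value-based implementation gives both copies the same hash.

```java
import java.util.Objects;

public class HashCodeDemo {
    // Hypothetical key type that inherits Object's identity-based equals()/hashCode().
    static final class IdentityKey {
        final String id;
        IdentityKey(String id) { this.id = id; }
    }

    // Hypothetical key type with value-based equals()/hashCode().
    static final class ValueKey {
        final String id;
        ValueKey(String id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            return o instanceof ValueKey && ((ValueKey) o).id.equals(id);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id);
        }
    }

    public static void main(String[] args) {
        // Two copies of the "same" key, e.g. one record deserialized twice.
        // Identity-based equality treats them as different keys.
        System.out.println(new IdentityKey("a").equals(new IdentityKey("a"))); // false

        // Value-based hashing yields the same hash for both copies.
        System.out.println(new ValueKey("a").hashCode() == new ValueKey("a").hashCode()); // true
    }
}
```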
>>On 07/10/2019 14:54, Jan Lukavský wrote:
>>>Hi Stephen,
>>>I found a very nice article [1], which might help you solve the issues you are
>>>concerned about. The elegant solution to this problem might be summarized as "do not implement
>>>equals() and hashCode() for POJO types, use Object's default implementation". I'm not 100%
>>>sure that this will not have any negative impacts on some other Flink components, but I _suppose_
>>>it should not (someone might correct me if I'm wrong).
>>>[1] http://web.mit.edu/6.031/www/sp17/classes/15-equality/
>>>On 10/7/19 1:37 PM, Chesnay Schepler wrote:
>>>>This question should only be relevant for cases where POJOs are used as keys,
>>>>in which case they must not return a class-constant nor effectively-random value, as this
>>>>would break the hash partitioning.
>>>>This is somewhat alluded to in the keyBy() documentation, but could be clarified.
>>>>It is in any case heavily discouraged to modify objects after they have been
>>>>emitted from a function; the mutability of POJOs is hence usually not a problem.
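Why a class-constant hash is a problem for keys can be sketched with a simplified stand-in for hash partitioning. This is not Flink's routing code, which is more involved than a plain modulo; the names `partitionFor` and `countPerPartition` are hypothetical. The sketch assumes only that the target partition is derived from the key's hashCode().

```java
import java.util.Map;
import java.util.TreeMap;

public class PartitionDemo {
    // Simplified stand-in for hash partitioning (an assumption, not Flink's implementation).
    static int partitionFor(int hashCode, int parallelism) {
        return Math.floorMod(hashCode, parallelism);
    }

    // Counts how many of the given key hashes land on each partition.
    static Map<Integer, Integer> countPerPartition(int[] hashes, int parallelism) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int h : hashes) {
            counts.merge(partitionFor(h, parallelism), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        int parallelism = 4;

        // Every key reports the same constant hash: all records go to one partition.
        int[] constantHashes = {42, 42, 42, 42, 42, 42, 42, 42};
        System.out.println(countPerPartition(constantHashes, parallelism)); // {2=8}

        // Value-derived hashes spread the records across the partitions.
        int[] valueHashes = {0, 1, 2, 3, 4, 5, 6, 7};
        System.out.println(countPerPartition(valueHashes, parallelism)); // {0=2, 1=2, 2=2, 3=2}
    }
}
```

An effectively random hash fails in the opposite way: two copies of the same logical key can be sent to different partitions, so per-key state is never found again.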
>>>>On 02/10/2019 14:17, Stephen Connolly wrote:
>>>>>I notice that https://ci.apache.org/projects/flink/flink-docs-stable/dev/types_serialization.html#rules-for-pojo-types
>>>>>says that all non-transient fields need a setter.
>>>>>That means that the fields cannot be final.
>>>>>That means that hashCode() should probably just return a constant
>>>>>value (otherwise an object could be mutated and then lost from a hash-based collection).
>>>>>Is it really the case that we have to either register a serializer or
>>>>>abandon immutability and consequently force hashCode to be a constant value?
>>>>>What are the recommended implementation patterns for the POJOs used in
>>>>>a topology?
Алексей Протченко
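
The concern raised in the original question, that a mutable object with a value-based hashCode() can be "lost" from a hash-based collection, can be reproduced with plain Java collections. This is a minimal sketch; the `Event` class and the `lostAfterMutation` helper are hypothetical, and `Event` is shaped only loosely after the POJO rules mentioned in the thread (public no-argument constructor, getter and setter for each field).

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class MutableKeyDemo {
    // Hypothetical mutable POJO with value-based equals()/hashCode().
    public static class Event {
        private String id;

        public Event() {}
        public Event(String id) { this.id = id; }

        public String getId() { return id; }
        public void setId(String id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Event && Objects.equals(id, ((Event) o).id);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(id);
        }
    }

    // Returns true if the set can no longer find the element after it was mutated.
    static boolean lostAfterMutation() {
        Set<Event> set = new HashSet<>();
        Event e = new Event("a");
        set.add(e);      // stored under the hash of "a"
        e.setId("b");    // the hash changes while the element is inside the set
        return !set.contains(e);
    }

    public static void main(String[] args) {
        System.out.println(lostAfterMutation()); // true
    }
}
```

The element is still physically in the set, but lookups use the new hash and miss it. This only happens when the object is mutated while it is held by the collection, which matches the advice in the thread not to modify objects after they have been emitted.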