flink-user mailing list archives

From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Shared Object Instance over different RichMapFunctions
Date Mon, 09 Jan 2017 15:32:35 GMT
Hi,
Flink will serialise user functions when distributing work across the
cluster. Therefore your shared object will no longer be a single shared
instance once your program executes: each function, and each parallel
instance of it, ends up with its own deserialised copy. You will still get
object sharing within a parallel instance, because only one instance of
your function is used to process all the data on one parallel instance of
an operation.
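
To illustrate the copying, here is a minimal sketch in plain Java with no
Flink dependency. It assumes the functions are shipped using standard Java
serialisation; LookupCache, LookupFunction and roundTrip are hypothetical
names made up for this example, not Flink classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class SharedInstanceDemo {

    // Hypothetical stand-in for the "cacheloader" from the question.
    static class LookupCache implements Serializable {
        private static final long serialVersionUID = 1L;
        final Map<String, String> entries = new HashMap<>();
    }

    // Hypothetical stand-in for a user function holding the cache as a field.
    static class LookupFunction implements Serializable {
        private static final long serialVersionUID = 1L;
        final LookupCache cache;

        LookupFunction(LookupCache cache) {
            this.cache = cache;
        }
    }

    // Serialise and deserialise a value, imitating what happens when a
    // function is shipped to a worker (assumption: Java serialisation).
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        LookupCache shared = new LookupCache();
        LookupFunction first = new LookupFunction(shared);
        LookupFunction second = new LookupFunction(shared);

        // In the client program both functions reference the same object.
        System.out.println(first.cache == second.cache); // true

        // Each function is serialised on its own, so each deserialised
        // copy carries its own private copy of the cache.
        LookupFunction shippedFirst = roundTrip(first);
        LookupFunction shippedSecond = roundTrip(second);
        System.out.println(shippedFirst.cache == shippedSecond.cache); // false
    }
}
```

If one cache per parallel instance is acceptable, a common approach is to
create it in the open() method of the RichMapFunction rather than passing
it in from the client program, so that nothing has to be serialised.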

Cheers,
Aljoscha

On Wed, 4 Jan 2017 at 21:05 Duck <kcud@protonmail.com> wrote:

> Hi there,
>
> I was wondering how my caching object would behave in the scenario
> below.
>
> 1) I create an instance of an object that performs lookups to an external
> resource, and caches results.
> 2) I have a DataStream that I perform a map function on (with a custom
> RichMapFunction)
> 3) I have a second DataStream that I perform a map function on (with a
> custom RichMapFunction)
> 4) I set the Job parallelism to 2.
>
> Will the multiple usage, together with the parallelism, duplicate my object
> in any way, or will it still behave as a "shared object instance"? I ask
> because this "cacheloader" talks to external resources, and I do not want
> it duplicated, for the sake of performance on the external resource.
>
