flink-dev mailing list archives

From "Kirschnick, Johannes" <johannes.kirschn...@tu-berlin.de>
Subject Operating on Serialized Data
Date Tue, 24 Feb 2015 10:13:45 GMT
Hi list,

I have a general question: is it possible to significantly speed up processing
by cutting down on serialization costs during iterations?

The basic setup is a couple of vectors that are repeatedly mutated (added &
multiplied) as part of an iterative run within a reducer.

A vector is basically "just" an array of doubles - all of the same size.

During simple profiling I noticed that roughly 50% of the execution time is spent serializing
the data with Kryo's com.esotericsoftware.kryo.serializers.DefaultArraySerializers.

I know that any custom operation would warrant custom processing, but given that serialization
contributes such a large share of the overall runtime, it might very well
be worthwhile.

Is operating directly on the serialized data currently exposed in any fashion to user code,
or are there hooks I could look into?
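
For illustration, here is a minimal sketch of what "operating on serialized data" could
look like for fixed-size double vectors. It is not Flink's or Kryo's API: the class name
SerializedVectorOps, its methods, and the byte layout (n doubles back to back, 8 bytes
each, no length header, ByteBuffer's default big-endian order) are all assumptions made
up for this example. Kryo's actual array format differs.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Flink or Kryo.
// Assumed layout: doubles stored back to back, 8 bytes each, no header.
final class SerializedVectorOps {
    private SerializedVectorOps() {}

    // Encode a vector into the assumed flat layout.
    static ByteBuffer toBytes(double[] v) {
        ByteBuffer buf = ByteBuffer.allocate(v.length * Double.BYTES);
        for (int i = 0; i < v.length; i++) {
            buf.putDouble(i * Double.BYTES, v[i]);
        }
        return buf;
    }

    // Decode the flat layout back into a double[] (only needed at the very end).
    static double[] fromBytes(ByteBuffer buf) {
        double[] v = new double[buf.capacity() / Double.BYTES];
        for (int i = 0; i < v.length; i++) {
            v[i] = buf.getDouble(i * Double.BYTES);
        }
        return v;
    }

    // acc += other, element by element, without materializing a double[].
    static void addInPlace(ByteBuffer acc, ByteBuffer other) {
        if (acc.capacity() != other.capacity()) {
            throw new IllegalArgumentException("vectors must have the same size");
        }
        for (int off = 0; off < acc.capacity(); off += Double.BYTES) {
            acc.putDouble(off, acc.getDouble(off) + other.getDouble(off));
        }
    }

    // acc *= factor, element by element, directly on the bytes.
    static void scaleInPlace(ByteBuffer acc, double factor) {
        for (int off = 0; off < acc.capacity(); off += Double.BYTES) {
            acc.putDouble(off, acc.getDouble(off) * factor);
        }
    }
}
```

With this layout, an iteration step such as addInPlace(acc, other) followed by
scaleInPlace(acc, 2.0) reads and writes the buffer by absolute offset, so no intermediate
double[] is allocated and no general-purpose serializer runs per step. Whether the
runtime hands such raw buffers to user code is exactly the open question above.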


