flink-dev mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Thoughts About Object Reuse and Collection Execution
Date Sat, 28 Feb 2015 06:57:13 GMT
This is going to have profound performance implications if this is the only
path for iteration.



On Fri, Feb 27, 2015 at 10:58 PM, Stephan Ewen <sewen@apache.org> wrote:

> I vote to have the key extractor return a new value each time. That means
> that objects are not reused everywhere where it is possible, but still in
> most places, which still helps.
>
> What still puzzles me: I thought that the collection execution stores
> copies of the returned records by default (reuse safe mode).
> On 27.02.2015 at 15:36, "Aljoscha Krettek" <aljoscha@apache.org> wrote:
>
> > Hello Nation of Flink,
> > while figuring out this bug:
> > https://issues.apache.org/jira/browse/FLINK-1569
> > I came upon some difficulties. The problem is that the
> > KeyExtractorMappers always return the same tuple. This is problematic
> > because Collection Execution simply stores the returned values in a
> > list. These elements are not copied before they are stored when object
> > reuse is enabled. Therefore, the whole list will contain only that one
> > reused element.
> >
> > I see two options for solving this:
> > 1. Change KeyExtractorMappers to always return a new tuple, thereby
> > making object-reuse mode in cluster execution useless for key
> > extractors.
> >
> > 2. Change collection execution mapper to always make copies of the
> > returned elements. This would make object-reuse in collection
> > execution pretty much obsolete, IMHO.
> >
> > How should we proceed with this?
> >
> > Cheers,
> > Aljoscha
> >
>
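
The aliasing problem described in the quoted message, and the two proposed fixes, can be illustrated with a minimal sketch in plain Java. It does not use Flink classes: Pair, ReusingKeyExtractor, AllocatingKeyExtractor, and the collect* methods are hypothetical stand-ins for the tuple type, the KeyExtractorMappers, and the collection execution loop, and the string length is used as an arbitrary key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ObjectReuseSketch {

    /** Hypothetical mutable (key, value) pair standing in for a tuple. */
    static final class Pair {
        int key;
        String value;

        Pair() {}

        Pair(int key, String value) {
            this.key = key;
            this.value = value;
        }

        Pair copy() {
            return new Pair(key, value);
        }
    }

    /** Mutates and returns the same instance on every call (object reuse). */
    static final class ReusingKeyExtractor {
        private final Pair reused = new Pair();

        Pair map(String value) {
            reused.key = value.length(); // arbitrary key for illustration
            reused.value = value;
            return reused;
        }
    }

    /** Option 1: the extractor allocates a new object on every call. */
    static final class AllocatingKeyExtractor {
        Pair map(String value) {
            return new Pair(value.length(), value);
        }
    }

    /** The bug: every list slot references the single reused object. */
    static List<Pair> collectWithoutCopy(List<String> input) {
        ReusingKeyExtractor extractor = new ReusingKeyExtractor();
        List<Pair> out = new ArrayList<>();
        for (String s : input) {
            out.add(extractor.map(s));
        }
        return out;
    }

    /** Option 1 applied: storing the returned reference is now safe. */
    static List<Pair> collectWithNewObjects(List<String> input) {
        AllocatingKeyExtractor extractor = new AllocatingKeyExtractor();
        List<Pair> out = new ArrayList<>();
        for (String s : input) {
            out.add(extractor.map(s));
        }
        return out;
    }

    /** Option 2 applied: the collecting loop copies before storing. */
    static List<Pair> collectWithCopy(List<String> input) {
        ReusingKeyExtractor extractor = new ReusingKeyExtractor();
        List<Pair> out = new ArrayList<>();
        for (String s : input) {
            out.add(extractor.map(s).copy());
        }
        return out;
    }

    /** Helper that extracts the keys so results are easy to compare. */
    static List<Integer> keys(List<Pair> pairs) {
        List<Integer> result = new ArrayList<>();
        for (Pair p : pairs) {
            result.add(p.key);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "bb", "ccc");
        System.out.println(keys(collectWithoutCopy(input)));    // [3, 3, 3]
        System.out.println(keys(collectWithNewObjects(input))); // [1, 2, 3]
        System.out.println(keys(collectWithCopy(input)));       // [1, 2, 3]
    }
}
```

In this sketch both options pay one allocation per record. Option 1 pays it inside the extractor on every execution path, while option 2 pays it only in the collecting loop and leaves the extractor free to reuse its object elsewhere.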
