flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1005) Add different mutable-object modes to runtime
Date Mon, 21 Jul 2014 11:48:38 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068430#comment-14068430 ]

ASF GitHub Bot commented on FLINK-1005:

GitHub user StephanEwen reopened a pull request:


    [FLINK-1005] Add immutable object mode utils and enable it for GroupReduce

    This pull request adds the basics of immutable object mode and implements it for GroupReduce.
    This allows user code to keep references to values without their contents being overwritten,
    which used to happen for mutable types.
    I vote to make this the default mode in future versions.
    Code like the following used to give unexpected results because of heavy object reuse
    in the runtime. With *immutable object mode*, it now gives the expected results.
    List<Tuple2<StringValue, IntValue>> all = new ArrayList<Tuple2<StringValue, IntValue>>();
    while (values.hasNext()) {
        all.add(values.next());  // keep a reference to every value
    }
    Tuple2<StringValue, IntValue> result = all.get(0);
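
To see why keeping references is a problem under object reuse, here is a minimal, self-contained sketch. It does not use the Flink classes: `Box` and `reusingIterator` are hypothetical stand-ins for a mutable value type and for an iterator that overwrites one instance on every call.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReusePitfall {

    /** Hypothetical stand-in for a mutable value type. */
    static final class Box {
        int value;
    }

    /** Hands out the same Box instance on every call and overwrites its contents. */
    static Iterator<Box> reusingIterator(final int[] data) {
        return new Iterator<Box>() {
            private final Box reuse = new Box();
            private int pos = 0;

            @Override
            public boolean hasNext() {
                return pos < data.length;
            }

            @Override
            public Box next() {
                if (pos >= data.length) {
                    throw new NoSuchElementException();
                }
                reuse.value = data[pos++];
                return reuse;
            }
        };
    }

    /** User-style code that stores every reference returned by the iterator. */
    static List<Box> collectAll(Iterator<Box> values) {
        List<Box> all = new ArrayList<Box>();
        while (values.hasNext()) {
            all.add(values.next());
        }
        return all;
    }

    public static void main(String[] args) {
        List<Box> all = collectAll(reusingIterator(new int[] {1, 2, 3}));
        // All three list entries alias the same Box, so the first entry shows the last value.
        System.out.println(all.get(0).value); // prints 3
    }
}
```

The list holds three references to one object, which is the kind of surprise that handing out a fresh object per record avoids.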

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StephanEwen/incubator-flink mutable_immutable

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #66

commit 0f5c049cc3f91530ca36816d9742c2a04234989f
Author: Stephan Ewen <sewen@apache.org>
Date:   2014-07-07T17:39:24Z

    [FLINK-1005] Extend TypeSerializer interface to handle non-mutable object deserialization more efficiently.

commit 618e0b3d7c72b67a555a1e8db6925a7d5d0b4c92
Author: Stephan Ewen <sewen@apache.org>
Date:   2014-07-08T09:32:09Z

    [FLINK-1005] Add non-object reusing variants of key-grouped iterator.
    Clean minor javadoc errors.

commit 9703593a4b35b148489e840875b46b45e26bf966
Author: Stephan Ewen <sewen@apache.org>
Date:   2014-07-08T10:19:43Z

    [FLINK-1005] Make GroupReduce configurable to use either mutable or immutable object mode


> Add different mutable-object modes to runtime
> ---------------------------------------------
>                 Key: FLINK-1005
>                 URL: https://issues.apache.org/jira/browse/FLINK-1005
>             Project: Flink
>          Issue Type: Improvement
>          Components: Local Runtime
>    Affects Versions: 0.6-incubating
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 0.6-incubating
> Currently, the runtime works strictly with mutable objects. That means that as few objects
> as possible (typically one or two) are reused for the data records all the time. Objects are
> cloned/restored, though, at various places to ensure that the contents are fresh at every call.
> The rationale behind this was to reduce pressure on the garbage collector. In fact, you
> can run programs where no garbage collection happens (if the UDFs are written to reuse objects
> as well).
> It can, however, lead to bugs in carelessly written user code.
> I propose to add two modes to the runtime:
>   - No-object-reuse (default) mode. New objects for every record. Safe but potentially
>   - Object-reusing mode. All objects are reused, without backup copies. The UDFs must
>     be careful not to keep any objects as state or to modify the objects,
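
The difference between the two modes can be illustrated with a small sketch. It is not the runtime's implementation: `Record`, `run`, and the `objectReuse` flag are hypothetical names chosen for illustration. The driver either allocates a new object per record or overwrites a single instance, and the user function keeps every object it is handed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ObjectModes {

    /** Hypothetical mutable record type. */
    static final class Record {
        int value;
    }

    /**
     * Hypothetical driver loop. With objectReuse == false it allocates a new
     * Record per input element; with objectReuse == true it overwrites one instance.
     */
    static void run(int[] input, boolean objectReuse, Consumer<Record> udf) {
        Record reuse = new Record();
        for (int v : input) {
            Record r = objectReuse ? reuse : new Record();
            r.value = v;
            udf.accept(r);
        }
    }

    /** Runs a user function that keeps every record as state, then reads the kept values. */
    static List<Integer> retainedValues(int[] input, boolean objectReuse) {
        List<Record> kept = new ArrayList<>();
        run(input, objectReuse, kept::add);

        List<Integer> out = new ArrayList<>();
        for (Record r : kept) {
            out.add(r.value);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] input = {1, 2, 3};
        System.out.println(retainedValues(input, false)); // prints [1, 2, 3]
        System.out.println(retainedValues(input, true));  // prints [3, 3, 3]
    }
}
```

Without reuse, the user function can safely hold on to records at the cost of one allocation per record. With reuse, the same function silently sees only the last value.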

This message was sent by Atlassian JIRA
