flink-dev mailing list archives

From "Jaromir Vanek (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-6394) GroupCombine reuses instances even though object reuse is disabled
Date Thu, 27 Apr 2017 11:46:05 GMT
Jaromir Vanek created FLINK-6394:

             Summary: GroupCombine reuses instances even though object reuse is disabled
                 Key: FLINK-6394
                 URL: https://issues.apache.org/jira/browse/FLINK-6394
             Project: Flink
          Issue Type: Bug
    Affects Versions: 1.2.0
            Reporter: Jaromir Vanek
            Priority: Critical

I am using a group combiner in the DataSet API with object reuse disabled.

In code, it can be expressed as follows:

    .combineGroup((it, collector) -> {
        // store first item for future use
        Pojo stored = it.next();
        while (it.hasNext()) {

It seems that even though object reuse is disabled, my stored instance is overwritten when
{{.next()}} is called on the iterator. This leads to confusing and incorrect results.

I checked the Flink codebase, and it seems that {{CombiningUnilateralSortMerger}} reuses
object instances even though object reuse is explicitly disabled.
In the spilling phase, the user's combiner is called with an instance of {{CombineValueIterator}},
which reuses instances without any warning.
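
The failure mode can be illustrated without Flink by a minimal, self-contained sketch. Everything in it is an assumption made for illustration: {{Pojo}}, {{ReusingIterator}}, and both helper methods are hypothetical stand-ins, not Flink classes, and the iterator only imitates the reuse behaviour described above. The second helper shows a defensive copy, which is one possible way to avoid depending on the iterator's instance handling; it is not a confirmed fix for this issue.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReuseDemo {

    /** Hypothetical mutable POJO standing in for the reporter's Pojo type. */
    static class Pojo {
        int value;

        Pojo(int value) {
            this.value = value;
        }

        /** Returns an independent copy that later mutations cannot affect. */
        Pojo copy() {
            return new Pojo(value);
        }
    }

    /** Hypothetical iterator that returns one shared instance, overwritten on every next(). */
    static class ReusingIterator implements Iterator<Pojo> {
        private final Iterator<Integer> source;
        private final Pojo shared = new Pojo(0);

        ReusingIterator(List<Integer> values) {
            this.source = values.iterator();
        }

        @Override
        public boolean hasNext() {
            return source.hasNext();
        }

        @Override
        public Pojo next() {
            shared.value = source.next(); // mutates the instance handed out earlier
            return shared;
        }
    }

    /** Mirrors the snippet above: keeps a reference to the first element, then drains the iterator. */
    static int firstValueNaive(Iterator<Pojo> it) {
        Pojo stored = it.next();
        while (it.hasNext()) {
            it.next();
        }
        return stored.value;
    }

    /** Same logic, but copies the first element before advancing the iterator. */
    static int firstValueWithCopy(Iterator<Pojo> it) {
        Pojo stored = it.next().copy();
        while (it.hasNext()) {
            it.next();
        }
        return stored.value;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3);
        // The stored reference is overwritten by later next() calls: prints 3, not 1.
        System.out.println(firstValueNaive(new ReusingIterator(values)));
        // The copy is unaffected by reuse: prints 1.
        System.out.println(firstValueWithCopy(new ReusingIterator(values)));
    }
}
```

With a reusing iterator the naive version reports the last element instead of the first, which matches the kind of silently wrong result described above.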

See https://github.com/apache/flink/blob/d7b59d761601baba6765bb4fc407bcd9fd6a9387/flink-runtime/src/main/java/org/apache/flink/runtime/operators/sort/CombiningUnilateralSortMerger.java#L550

When I disable the combiner and use only {{groupReduce}} with the same reduce function, the
results are correct.

Please let me know if you can confirm this as a bug. From my point of view it is critical, as I
am getting unpredictable results.
