flink-dev mailing list archives

From "Daniel Bali (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-1628) Strange behavior of "where" function during a join
Date Mon, 02 Mar 2015 17:53:04 GMT
Daniel Bali created FLINK-1628:

             Summary: Strange behavior of "where" function during a join
                 Key: FLINK-1628
                 URL: https://issues.apache.org/jira/browse/FLINK-1628
             Project: Flink
          Issue Type: Bug
    Affects Versions: 0.9
            Reporter: Daniel Bali


If I pass a field list to the `where` function during a join, the join returns an incorrect result.

Here is the sample code that triggers the error: https://gist.github.com/balidani/d9789b713e559d867d5c

This example joins a DataSet with itself, then counts the number of rows. If I use
`.where(0, 1)` the result is (22), which is not correct. If I use `EdgeKeySelector` instead,
I get the correct result (101).
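
For reference, the row count an inner self-join should produce can be checked outside of Flink. The sketch below is plain Java and is not taken from the gist: the class `SelfJoinCount`, the method `expectedSelfJoinRows`, and the sample edges are hypothetical names and data chosen for illustration. It assumes the join key is the pair of fields (0, 1) and that the count is taken directly over the join output; each key that occurs n times contributes n * n joined rows.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for sanity-checking a count; not part of Flink or the gist.
class SelfJoinCount {

    // Rows an inner self-join on fields (0, 1) should yield:
    // a key that appears n times matches itself n * n times.
    static long expectedSelfJoinRows(List<long[]> edges) {
        Map<List<Long>, Long> rowsPerKey = new HashMap<>();
        for (long[] edge : edges) {
            // Composite key built from the first two fields of the row.
            rowsPerKey.merge(Arrays.asList(edge[0], edge[1]), 1L, Long::sum);
        }
        long rows = 0;
        for (long n : rowsPerKey.values()) {
            rows += n * n;
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample data: the key (1, 2) appears twice, the others once.
        List<long[]> edges = Arrays.asList(
                new long[] {1, 2},
                new long[] {1, 2},
                new long[] {2, 3},
                new long[] {3, 4});
        System.out.println(expectedSelfJoinRows(edges));
    }
}
```

Under these assumptions, an input whose (0, 1) pairs are all distinct yields exactly one joined row per input row, so a count lower than the number of input rows would suggest that rows are being dropped or mismatched on the key.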

When I pass a field list to the `equalTo` function (but not to `where`), the result is correct again.

If I leave out the `groupBy` and `reduceGroup` parts, the result is also correct.

Also, when working with large DataSets, passing a field list to `where` makes the job extremely
slow, even though I don't see any exceptions in the log (at DEBUG level).

Does anybody know what might cause this problem?


This message was sent by Atlassian JIRA
