flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1628) Strange behavior of "where" function during a join
Date Fri, 06 Mar 2015 00:01:38 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349652#comment-14349652 ]

ASF GitHub Bot commented on FLINK-1628:

GitHub user fhueske opened a pull request:


    [FLINK-1628] Fix partitioning properties for Joins and CoGroups.

    Fix partitioning properties for Joins and CoGroups, and some smaller bugs along the way.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fhueske/flink joinCompilerBug

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #458

commit 89c3bd0b76c1b4ace58e93571b361cdc0af2cbd6
Author: Fabian Hueske <fhueske@apache.org>
Date:   2015-03-04T17:49:22Z

    [FLINK-1628] Fix partitioning properties for Joins and CoGroups.


> Strange behavior of "where" function during a join
> --------------------------------------------------
>                 Key: FLINK-1628
>                 URL: https://issues.apache.org/jira/browse/FLINK-1628
>             Project: Flink
>          Issue Type: Bug
>          Components: Optimizer
>    Affects Versions: 0.9
>            Reporter: Daniel Bali
>            Assignee: Fabian Hueske
>            Priority: Critical
>              Labels: batch
> Hello!
> If I use the `where` function with a field list during a join, it exhibits strange behavior.
> Here is the sample code that triggers the error: https://gist.github.com/balidani/d9789b713e559d867d5c
> This example joins a DataSet with itself, then counts the number of rows. If I use `.where(0, 1)` the result is (22), which is not correct. If I use `EdgeKeySelector`, I get the correct result (101).
> When I pass a field list to the `equalTo` function (but not `where`), everything works.
> If I don't include the `groupBy` and `reduceGroup` parts, everything works.
> Also, when working with large DataSets, passing a field list to `where` makes it incredibly slow, even though I don't see any exceptions in the log (in DEBUG mode).
> Does anybody know what might cause this problem?
> Thanks!
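
The report above rests on the expectation that joining on a field list and joining through a key selector over the same fields give the same row count. A minimal sketch of that expectation in plain Java, without Flink, might look like the following. The class name, the helper methods, and the sample edges are all hypothetical and are not taken from the linked gist or from the Flink API; the self-join is simulated by grouping rows on the key and summing the squared group sizes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class CompositeKeySelfJoin {

    // Hypothetical stand-in for a field list such as (0, 1): the key is the pair of fields.
    public static List<Long> fieldKey(long[] row) {
        return Arrays.asList(row[0], row[1]);
    }

    // Hypothetical stand-in for a key selector that combines both fields into one value.
    public static String selectorKey(long[] row) {
        return row[0] + "," + row[1];
    }

    // Counts the rows an equi-self-join would emit: each group of n rows
    // sharing a key contributes n * n joined pairs.
    public static <K> long selfJoinCount(List<long[]> rows, Function<long[], K> key) {
        Map<K, Long> groupSizes = new HashMap<>();
        for (long[] row : rows) {
            groupSizes.merge(key.apply(row), 1L, Long::sum);
        }
        long pairs = 0;
        for (long n : groupSizes.values()) {
            pairs += n * n;
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Made-up sample edges (source, target); one edge appears twice.
        List<long[]> edges = Arrays.asList(
                new long[] {1, 2},
                new long[] {1, 3},
                new long[] {2, 3},
                new long[] {1, 2});

        // Both key definitions identify the same groups, so the counts agree.
        System.out.println(selfJoinCount(edges, CompositeKeySelfJoin::fieldKey));    // prints 6
        System.out.println(selfJoinCount(edges, CompositeKeySelfJoin::selectorKey)); // prints 6
    }
}
```

Any difference between the two counts, like the (22) versus (101) reported above, therefore points at how the join is planned or executed rather than at the key definitions themselves.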

This message was sent by Atlassian JIRA
