flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2105) Implement Sort-Merge Outer Join algorithm
Date Thu, 23 Jul 2015 23:47:04 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14639681#comment-14639681

ASF GitHub Bot commented on FLINK-2105:

Github user fhueske commented on the pull request:

    Very nice work! :+1: I had just a few minor style regards.
    It would be nice if you could check the overhead of having all outer join cases handled
by the same class vs. having a dedicated class for LEFT, RIGHT, and FULL OUTER. 
    I would also like to run a major regression benchmark to double-check that the refactoring
of the current sort-merge join didn't cause a performance regression. I can take care of that
once you addressed the comments.
    If all lights are green after that, I would be happy to merge this PR.

> Implement Sort-Merge Outer Join algorithm
> -----------------------------------------
>                 Key: FLINK-2105
>                 URL: https://issues.apache.org/jira/browse/FLINK-2105
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Local Runtime
>            Reporter: Fabian Hueske
>            Assignee: Ricky Pogalz
>            Priority: Minor
>             Fix For: pre-apache
> Flink does not natively support outer joins at the moment. 
> This issue proposes to implement a sort-merge outer join algorithm that can cover left,
right, and full outer joins.
> The implementation can be based on the regular sort-merge join iterator ({{ReusingMergeMatchIterator}}
and {{NonReusingMergeMatchIterator}}, see also {{MatchDriver}} class)
> The Reusing and NonReusing variants differ in whether object instances are reused or
new objects are created. I would start with the NonReusing variant which is safer from a user's
point of view and should also be easier to implement.

This message was sent by Atlassian JIRA

View raw message