flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2871) Add OuterJoin strategy with HashTable on outer side
Date Wed, 20 Jan 2016 10:02:39 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108313#comment-15108313 ]

ASF GitHub Bot commented on FLINK-2871:

Github user ChengXiangLi commented on the pull request:

    I ran a simple regression test based on `HashVsSortMiniBenchmark`; the results are:
    Test | Before | After
    ------ | ------ | --------
    testBuildFirst | 6.63s | 6.65s
    testBuildSecond | 3.7s | 3.8s
    Inner join performance is not affected by this PR, which matches my expectation.
    `MutableHashTable` has a flag called `buildSideOuterJoin`, and all of the extra work
    only happens when `buildSideOuterJoin` is true.

> Add OuterJoin strategy with HashTable on outer side
> ---------------------------------------------------
>                 Key: FLINK-2871
>                 URL: https://issues.apache.org/jira/browse/FLINK-2871
>             Project: Flink
>          Issue Type: New Feature
>          Components: Local Runtime, Optimizer
>    Affects Versions: 0.10.0
>            Reporter: Fabian Hueske
>            Assignee: Chengxiang Li
>            Priority: Minor
> Outer joins are currently supported with two local execution strategies:
> - sort-merge join
> - hash join where the hash table is built on the inner side. Hence, this strategy is
>   only supported for left and right outer joins.
> In order to support hash-tables on the outer side, we need a special hash table implementation
> that gives access to all records which have not been accessed during the probe phase.
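
The idea in the issue description, together with the `buildSideOuterJoin` flag mentioned in the comment, can be illustrated with a small standalone sketch. This is not Flink's `MutableHashTable`; the class `BuildSideOuterJoinSketch`, its `Row` and `BuildEntry` types, and the `join` method are hypothetical names chosen for illustration. It assumes an in-memory table and integer keys, and only shows the principle: mark build-side records as they are matched during the probe phase, then emit the unmatched ones afterwards, with all extra work skipped when the flag is false.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the Flink implementation.
class BuildSideOuterJoinSketch {

    /** A simple keyed record (hypothetical type for this sketch). */
    static final class Row {
        final int key;
        final String value;
        Row(int key, String value) { this.key = key; this.value = value; }
    }

    /** A build-side record plus a flag that remembers whether a probe record hit it. */
    static final class BuildEntry {
        final Row row;
        boolean matched;
        BuildEntry(Row row) { this.row = row; }
    }

    /**
     * Hash join with the hash table built on the first input.
     * If buildSideOuterJoin is true, build-side records that no probe
     * record matched are emitted at the end, padded with null.
     */
    static List<String> join(List<Row> build, List<Row> probe, boolean buildSideOuterJoin) {
        // Build phase: LinkedHashMap keeps insertion order so the output is deterministic.
        Map<Integer, List<BuildEntry>> table = new LinkedHashMap<>();
        for (Row r : build) {
            table.computeIfAbsent(r.key, k -> new ArrayList<>()).add(new BuildEntry(r));
        }

        // Probe phase: emit matches; track matched build records only when needed.
        List<String> out = new ArrayList<>();
        for (Row p : probe) {
            List<BuildEntry> bucket = table.get(p.key);
            if (bucket == null) {
                continue; // the probe side is the inner side, so unmatched probe rows are dropped
            }
            for (BuildEntry e : bucket) {
                out.add("(" + e.row.value + ", " + p.value + ")");
                if (buildSideOuterJoin) {
                    e.matched = true; // the only extra work on the hot path
                }
            }
        }

        // Extra pass, skipped entirely for inner joins.
        if (buildSideOuterJoin) {
            for (List<BuildEntry> bucket : table.values()) {
                for (BuildEntry e : bucket) {
                    if (!e.matched) {
                        out.add("(" + e.row.value + ", null)");
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> build = Arrays.asList(new Row(1, "a"), new Row(2, "b"), new Row(3, "c"));
        List<Row> probe = Arrays.asList(new Row(1, "x"), new Row(1, "y"), new Row(4, "z"));
        System.out.println(join(build, probe, false)); // inner join
        System.out.println(join(build, probe, true));  // outer join on the build side
    }
}
```

With the flag set to false the sketch behaves like a plain inner hash join, which is consistent with the observation in the comment that inner join performance should be unaffected. A real implementation would presumably need a more compact way to track matched records than a per-record boolean, for example when parts of the table are spilled to disk.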

This message was sent by Atlassian JIRA
