flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-7755) Null values are not correctly handled by batch inner and outer joins
Date Fri, 20 Oct 2017 22:04:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-7755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213322#comment-16213322 ]

ASF GitHub Bot commented on FLINK-7755:

Github user fhueske commented on a diff in the pull request:

    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/rules/dataSet/DataSetJoinRule.scala
    @@ -41,8 +41,7 @@ class DataSetJoinRule
         val joinInfo = join.analyzeCondition
         // joins require an equi-condition or a conjunctive predicate with at least one equi-condition
    -    // and disable outer joins with non-equality predicates(see FLINK-5520)
    -    !joinInfo.pairs().isEmpty && (joinInfo.isEqui || join.getJoinType == JoinRelType.INNER)
    +    !joinInfo.pairs().isEmpty
    --- End diff --
    Yes, that is true, but the rules are also applied in different contexts. `FlinkLogicalJoin`
is used for the initial translation of batch and stream programs and `DataSetJoinRule` only
for batch. I think it's OK to have these checks as a safety net.

> Null values are not correctly handled by batch inner and outer joins
> --------------------------------------------------------------------
>                 Key: FLINK-7755
>                 URL: https://issues.apache.org/jira/browse/FLINK-7755
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API & SQL
>    Affects Versions: 1.4.0, 1.3.2
>            Reporter: Fabian Hueske
>            Assignee: Fabian Hueske
>            Priority: Blocker
>             Fix For: 1.4.0, 1.3.3
> Join predicates of batch joins are not correctly evaluated according to three-value logic.
> This affects inner as well as outer joins.
> The problem is that some equality predicates are only evaluated by the internal join
algorithms of Flink, which are based on {{TypeComparator}}. The field {{TypeComparator}}s for
{{Row}} are implemented such that {{null == null}} results in {{TRUE}} to ensure correct ordering
and grouping. However, three-value logic requires that {{null == null}} results in {{UNKNOWN}}
(or null). The code generator implements this logic correctly, but for equality predicates,
no code is generated.
> For outer joins, the problem is a bit trickier because these do not support code-generated
predicates yet (see FLINK-5520). FLINK-5498 proposes a solution for this issue.
> We also need to extend several of the existing tests and add null values to ensure that
the join logic is correctly implemented. 
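
To illustrate the mismatch described above, here is a minimal, self-contained sketch in plain Java. It does not use Flink's actual {{TypeComparator}} or code generator; the class and method names ({{ThreeValuedEquality}}, {{comparatorEquals}}, {{sqlEquals}}, {{joinMatches}}) are hypothetical and exist only for this example. It contrasts comparator-style equality, where null equals null, with SQL-style equality, where a null operand yields UNKNOWN (modeled here as a null {{Boolean}}).

```java
import java.util.Objects;

public class ThreeValuedEquality {

    /** Comparator-style equality: null equals null, as needed for ordering and grouping. */
    static boolean comparatorEquals(Object left, Object right) {
        return Objects.equals(left, right);
    }

    /** SQL-style equality: returns null (UNKNOWN) if either operand is null. */
    static Boolean sqlEquals(Object left, Object right) {
        if (left == null || right == null) {
            return null;
        }
        return left.equals(right);
    }

    /** A join emits a pair only if the predicate is TRUE; UNKNOWN is treated like FALSE. */
    static boolean joinMatches(Object left, Object right) {
        return Boolean.TRUE.equals(sqlEquals(left, right));
    }

    public static void main(String[] args) {
        System.out.println(comparatorEquals(null, null)); // true
        System.out.println(sqlEquals(null, null));        // null
        System.out.println(joinMatches(null, null));      // false
        System.out.println(joinMatches(1, 1));            // true
    }
}
```

Under this sketch, a join that relies only on comparator-style equality would pair two rows whose keys are both null, whereas a join following three-value logic would not.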

This message was sent by Atlassian JIRA