drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mehant Baid (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-1887) Wrong result for hash join on null valued columns
Date Fri, 19 Dec 2014 13:19:13 GMT

    [ https://issues.apache.org/jira/browse/DRILL-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253370#comment-14253370
] 

Mehant Baid commented on DRILL-1887:
------------------------------------

In ChainedHashTable the condition being checked is (!areNullsEqual && isLeftNull &&
isRightNull), if this condition is satisfied then we return false. However if nulls aren't
equal then if one of the two sides is Null we should be able to return false so can't this
condition be 
(!areNullsEqual && (isLeftNull || isRightNull)). In this case we won't go through
the comparator at all?

Otherwise looks good, +1.

> Wrong result for hash join on null valued columns
> -------------------------------------------------
>
>                 Key: DRILL-1887
>                 URL: https://issues.apache.org/jira/browse/DRILL-1887
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Rahul Challapalli
>            Assignee: Aman Sinha
>         Attachments: 0001-DRILL-1887-Add-code-gen-for-explicitly-comparing-nul.patch
>
>
> git.commit.id.abbrev=9dfa4a1
> Dataset1:
> {code}
> {
>  "col1":1,
>  "col2":"abc"
> }
> {
>  "col1":2,
>  "col2":null
> }
> {code}
> Dataset 2:
> {code}
> {
>  "col1":1,
>  "col2":null
> }
> {
>  "col1":2,
>  "col2":null
> }
> {
>  "col1":2,
>  "col2":"abc"
> }
> {code}
> Query :
> {code}
> select * from `a.json` a join `b.json` b on a.col2=b.col2; 
> +------------+------------+------------+------------+
> |    col1    |    col2    |   col10    |   col20    |
> +------------+------------+------------+------------+
> | 1          | abc        | 2          | abc        |
> | 2          | null       | 1          | null       |
> | 2          | null       | 2          | null       |
> +------------+------------+------------+------------+
> {code}
> Most database engines treat 2 NULL values as not being equal. This makes it hard to compare
DRILL's results against any Standard SQL engine. However I am not sure whether drill is intentionally
designed to behave this way.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message