hive-dev mailing list archives

From "Ashutosh Chauhan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-4845) Correctness issue with MapJoins using the null safe operator
Date Tue, 16 Jul 2013 02:56:49 GMT

    [ https://issues.apache.org/jira/browse/HIVE-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709399#comment-13709399 ]

Ashutosh Chauhan commented on HIVE-4845:
----------------------------------------

I meant your previous two patches on this same JIRA (the second one looked like it was on top
of the first one, instead of including it). But now that you have regenerated the patch, the
new one supersedes the earlier two.
                
> Correctness issue with MapJoins using the null safe operator
> ------------------------------------------------------------
>
>                 Key: HIVE-4845
>                 URL: https://issues.apache.org/jira/browse/HIVE-4845
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Brock Noland
>            Assignee: Brock Noland
>            Priority: Critical
>         Attachments: HIVE-4845.patch, HIVE-4845.patch, HIVE-4845.patch
>
>
> I found a correctness issue while working on HIVE-4838. The following query from join_nullsafe.q
> gives different results depending on whether it is executed map-side or reduce-side:
> {noformat}
> SELECT /*+ MAPJOIN(a) */ * FROM smb_input1 a JOIN smb_input1 b ON a.key <=> b.key
> AND a.value <=> b.value ORDER BY a.key, a.value, b.key, b.value;
> {noformat}
> For that query, on the map side, rows that should be joined are not. For example, the
> reduce side outputs this row:
> {noformat}
> a.key   a.value   b.key   b.value
> 148     NULL      148     NULL
> {noformat}
> which makes sense, since under <=> a.key is equal to b.key and a.value is equal to b.value
> (NULL <=> NULL is true), but the current map-side code omits this row. The reason is that
> the map-side join uses MapJoinDoubleKey, which doesn't properly compare null values.
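
To illustrate the null handling described in the quoted report, here is a minimal Java sketch of a
two-column key whose equality treats NULL as equal to NULL, as the <=> operator requires. The class
name NullSafeDoubleKey is hypothetical; this is not Hive's actual MapJoinDoubleKey, nor the fix in
the attached patch, and it assumes every join column uses <=> (a plain = column would need NULLs
to stay unmatched).

```java
import java.util.Objects;

// Hypothetical two-column join key for illustration only.
// Assumption: both columns are compared with the null-safe operator (<=>).
final class NullSafeDoubleKey {
    private final Object first;   // e.g. a.key
    private final Object second;  // e.g. a.value, which may be null

    NullSafeDoubleKey(Object first, Object second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof NullSafeDoubleKey)) {
            return false;
        }
        NullSafeDoubleKey that = (NullSafeDoubleKey) other;
        // Objects.equals returns true when both arguments are null,
        // which mirrors NULL <=> NULL evaluating to true.
        return Objects.equals(first, that.first)
            && Objects.equals(second, that.second);
    }

    @Override
    public int hashCode() {
        // Objects.hash tolerates null fields, so keys with NULL columns
        // still land in the same hash bucket and can be looked up.
        return Objects.hash(first, second);
    }
}
```

With a key like this, a hash-table lookup for (148, NULL) on the probe side would find the
(148, NULL) entry built from the small table, which is the row the reduce-side join returns.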

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
