hive-dev mailing list archives

From "Yin Huai (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-4689) For outerjoins, joinEmitInterval might make wrong result
Date Tue, 02 Jul 2013 03:13:20 GMT

    [ https://issues.apache.org/jira/browse/HIVE-4689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697455#comment-13697455 ]

Yin Huai commented on HIVE-4689:
--------------------------------

I agree with Navis. Here is my understanding. Right now, if there is no alias filter, a
non-matching row is generated only when either the left or the right table has no matching
row (assuming a 2-way join). In that case, either the container of the left table or that
of the right table is empty, so joinEmitInterval does not affect the correctness of the
results.
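
The argument above can be illustrated with a toy model of one key group in a left outer
join. This is only a sketch under simplifying assumptions, not Hive's JoinOperator: the
function name left_outer_join_group, the matches predicate (standing in for an ON-clause
filter), and the chunking scheme (standing in for joinEmitInterval) are all hypothetical.

```python
def left_outer_join_group(left, right, matches, emit_interval=None):
    """Toy left outer join over ONE key group (not Hive's real code).

    `right` is flushed in chunks of `emit_interval` rows, loosely
    imitating an early emit. `matches(l, r)` stands in for an extra
    ON-clause filter; with no filter it always returns True.
    """
    if emit_interval is None or not right:
        chunks = [right]  # whole group at once (also covers an empty right side)
    else:
        chunks = [right[i:i + emit_interval]
                  for i in range(0, len(right), emit_interval)]

    out = []
    for chunk in chunks:
        for l in left:
            hits = [r for r in chunk if matches(l, r)]
            if hits:
                out.extend((l, r) for r in hits)
            else:
                # Decision made from this chunk only, i.e. possibly premature.
                out.append((l, None))
    return out


left = ["a1", "a2"]
right = ["b1", "b2", "b3"]

# No filter: every left row matches every right row in the group.
no_filter = lambda l, r: True
whole = left_outer_join_group(left, right, no_filter)
chunked = left_outer_join_group(left, right, no_filter, emit_interval=1)

# With a filter: only b3 passes, so early chunks see no match yet.
only_b3 = lambda l, r: r == "b3"
whole_f = left_outer_join_group(left, right, only_b3)
chunked_f = left_outer_join_group(left, right, only_b3, emit_interval=1)
```

In this model, without a filter the chunked and unchunked outputs contain the same rows,
because a NULL-padded row can only appear when the right side is empty. With the filter,
the chunked run emits ("a1", None) before it ever sees b3, which is the kind of wrong
result described in the issue.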
                
> For outerjoins, joinEmitInterval might make wrong result
> --------------------------------------------------------
>
>                 Key: HIVE-4689
>                 URL: https://issues.apache.org/jira/browse/HIVE-4689
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.11.0
>            Reporter: Navis
>            Assignee: Navis
>         Attachments: HIVE-4689.D11211.1.patch
>
>
> The alias filter tag is calculated for each group and used for outer joins. But if joinEmitInterval
> is smaller than the group size, a premature alias filter tag would be used and might produce
> a different (wrong) result.
> It can be observed in the join_1to1.q test, but I cannot think of a proper solution that does
> not override the intention of joinEmitInterval. Should it be disabled for outer joins?

