hive-issues mailing list archives

From "Ashutosh Chauhan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-12664) Bug in reduce deduplication optimization causing ArrayOutOfBoundException
Date Wed, 06 Jan 2016 16:34:39 GMT

    [ https://issues.apache.org/jira/browse/HIVE-12664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15085780#comment-15085780 ]

Ashutosh Chauhan commented on HIVE-12664:
-----------------------------------------

+1

> Bug in reduce deduplication optimization causing ArrayOutOfBoundException
> -------------------------------------------------------------------------
>
>                 Key: HIVE-12664
>                 URL: https://issues.apache.org/jira/browse/HIVE-12664
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.1.1, 1.2.1
>            Reporter: Johan Gustavsson
>            Assignee: Johan Gustavsson
>         Attachments: HIVE-12664-1.patch, HIVE-12664-2.patch, HIVE-12664.1.patch, HIVE-12664.2.patch,
>                      HIVE-12664.patch
>
>
> The optimisation check for reduce deduplication only checks the first child node for
> join -and the check itself also contains a major bug- causing ArrayOutOfBoundException no
> matter what.
> Sample data table form:
> ||time||user||host||path||referer||code||agent||size||method||
> |int|string|string|string|string|bigint|string|bigint|string|
> Sample query
> {code:sql}
> SELECT 
>   t1.host,
>   COUNT(DISTINCT t1.`date`) AS login_count,
>   MAX(t2.code) AS code,
>   unix_timestamp() AS time
> FROM (
>     SELECT 
>       HOST,
>       MIN(time) AS DATE
>     FROM
>       www_access
>     WHERE
>       HOST IS NOT NULL
>     GROUP BY
>       HOST
>   ) t1
> JOIN (
>     SELECT 
>       HOST,
>       MIN(time) AS code
>     FROM
>       www_access
>     WHERE
>       HOST IS NOT NULL
>     GROUP BY
>       HOST
>   ) t2
>   ON t1.host = t2.host
> GROUP BY
>   t1.host
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
