hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-16017) MM tables - many queries duplicate the data after master merge
Date Fri, 24 Feb 2017 01:09:44 GMT

     [ https://issues.apache.org/jira/browse/HIVE-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-16017:
------------------------------------
    Fix Version/s: hive-14535

> MM tables - many queries duplicate the data after master merge
> --------------------------------------------------------------
>
>                 Key: HIVE-16017
>                 URL: https://issues.apache.org/jira/browse/HIVE-16017
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>             Fix For: hive-14535
>
>
> Update: it looks like this happens on many more queries. It started happening after a recent master merge, following a period when I wasn't working on the feature.
> This duplicates the data (given that the original query is a self-union, it essentially outputs the data 4 times instead of 2) for both MM and non-MM tables on the MM branch.
> The correct inputs seem to be added (in particular, in the non-MM case the inputs are the same as before). Presumably something in the output changes on the branch is broken for this case. Not sure what yet.
> {noformat}
> CREATE TABLE tbl1_mm(key int, value string) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
> insert overwrite table tbl1_mm select * from src where key < 10;
> select key, value from tbl1_mm a where key < 6
> union all
> select key, value from tbl1_mm a where key < 6;
> {noformat}
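
For reference, the expected cardinality of the self-union above can be modeled outside Hive. The following is a minimal sketch that uses SQLite as a stand-in: it ignores bucketing, sorting, and MM semantics, and the table name `tbl1`, the helper `self_union_counts`, and the sample rows are hypothetical, not taken from the issue. With a correct `UNION ALL`, each qualifying row should appear exactly twice; the bug described above would correspond to counts of 4.

```python
import sqlite3


def self_union_counts(keys):
    """Return {key: occurrences} for the self-union, using an in-memory SQLite table.

    SQLite is only a stand-in here; this does not exercise Hive or MM tables.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl1 (key INTEGER, value TEXT)")
    # Hypothetical sample data: one row per key.
    conn.executemany(
        "INSERT INTO tbl1 VALUES (?, ?)",
        [(k, f"val_{k}") for k in keys],
    )
    rows = conn.execute(
        "SELECT key, COUNT(*) FROM ("
        " SELECT key, value FROM tbl1 WHERE key < 6"
        " UNION ALL"
        " SELECT key, value FROM tbl1 WHERE key < 6"
        ") GROUP BY key ORDER BY key"
    ).fetchall()
    conn.close()
    return dict(rows)


if __name__ == "__main__":
    # Each key below 6 should be counted twice, once per branch of the union.
    print(self_union_counts(range(10)))
```

Running the same per-key count against the Hive output on the affected branch would be one way to confirm whether rows are being emitted more often than the union implies.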



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
