hive-dev mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-17675) verify SMB join with multiple inserts
Date Mon, 02 Oct 2017 23:26:00 GMT
Sergey Shelukhin created HIVE-17675:
---------------------------------------

             Summary: verify SMB join with multiple inserts
                 Key: HIVE-17675
                 URL: https://issues.apache.org/jira/browse/HIVE-17675
             Project: Hive
          Issue Type: Bug
            Reporter: Sergey Shelukhin


Hive has a family of joins that interact with sorted and bucketed tables. AFAIK at least one
(all?) of them actually relies on the table already being sorted, rather than sorting it.
If one runs an insert on such a table more than once without a merge, there would be 2+ files
for every bucket, each individually sorted; but globally, the table would no longer be sorted.
Would these joins work, or disable themselves correctly, in this case, or could they produce
incorrect results? We might need a q file.
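
To make the concern concrete, here is a minimal Python sketch. It is not Hive code: merge_join,
the key values, and the two "files" are all hypothetical, and it only models a generic sort-merge
join that trusts its inputs to be sorted. Whether Hive's SMB join behaves like this is exactly
what the issue asks to verify.

```python
import heapq


def merge_join(left, right):
    """Generic sort-merge join on unique keys.

    Assumes both inputs are sorted ascending and does not check it.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            i += 1
        elif left[i] > right[j]:
            j += 1
        else:
            out.append(left[i])
            i += 1
            j += 1
    return out


# Hypothetical bucket contents: one file per insert, each sorted on its own.
file_from_insert_1 = [1, 4, 7]
file_from_insert_2 = [2, 5, 8]
other_table_bucket = [2, 4, 5, 8]

# Reading the files back to back gives a bucket that is not globally sorted.
concatenated = file_from_insert_1 + file_from_insert_2
naive = merge_join(concatenated, other_table_bucket)

# Merging the sorted files first restores the order the join relies on.
merged = list(heapq.merge(file_from_insert_1, file_from_insert_2))
correct = merge_join(merged, other_table_bucket)

print(naive)    # [4, 8]
print(correct)  # [2, 4, 5, 8]
```

In this sketch the join over the concatenated files silently drops keys 2 and 5 instead of
failing, which is the kind of incorrect result a q file would need to rule out.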



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
