hive-dev mailing list archives

From "Namit Jain (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-4241) optimize hive.enforce.sorting and hive.enforce.bucketing join
Date Thu, 28 Mar 2013 04:47:15 GMT
Namit Jain created HIVE-4241:
--------------------------------

             Summary: optimize hive.enforce.sorting and hive.enforce.bucketing join
                 Key: HIVE-4241
                 URL: https://issues.apache.org/jira/browse/HIVE-4241
             Project: Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Namit Jain
            Assignee: Namit Jain


Consider the following scenario:


T1: sorted and bucketed by key into 2 buckets
T2: sorted and bucketed by key into 2 buckets
T3: sorted and bucketed by key into 2 buckets

set hive.enforce.sorting=true;
set hive.enforce.bucketing=true;
insert overwrite table T3
select .. from T1 join T2 on T1.key = T2.key;

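Since T1, T2 and T3 are all sorted and bucketed by the join key into the
same number of buckets, and the above join is being performed as a
sort-merge join, the join output for each bucket is already bucketed and
sorted by key. T3 should therefore be written bucketed/sorted without the
need for an extra reducer.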
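
The reasoning above can be illustrated with a small simulation. This is only a sketch in Python, not Hive code: it assumes integer keys, models bucket assignment as key % 2 (a simplification of Hive's hash-based bucketing), and uses hypothetical helper names (bucketize, merge_join, join_by_bucket). It merge-joins bucket i of T1 with bucket i of T2 and checks that each output bucket already holds only keys of that bucket, in sorted order, so no further shuffle or sort would be needed before writing T3.

```python
from typing import Dict, List

NUM_BUCKETS = 2  # matches the 2 buckets used by T1, T2 and T3 in the scenario


def bucketize(keys: List[int]) -> Dict[int, List[int]]:
    """Assign each key to bucket key % NUM_BUCKETS and sort every bucket.

    Assumption: modulo on integer keys stands in for Hive's bucketing hash.
    """
    buckets: Dict[int, List[int]] = {b: [] for b in range(NUM_BUCKETS)}
    for k in keys:
        buckets[k % NUM_BUCKETS].append(k)
    return {b: sorted(ks) for b, ks in buckets.items()}


def merge_join(left: List[int], right: List[int]) -> List[int]:
    """Sort-merge join of two sorted key lists.

    Emits the join key once per matching (left, right) pair.
    """
    out: List[int] = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            i += 1
        elif left[i] > right[j]:
            j += 1
        else:
            key = left[i]
            # Find the end of the run of equal keys on each side.
            i_end = i
            while i_end < len(left) and left[i_end] == key:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end] == key:
                j_end += 1
            # Cross product of the two runs, all carrying the same key.
            out.extend([key] * ((i_end - i) * (j_end - j)))
            i, j = i_end, j_end
    return out


def join_by_bucket(t1: Dict[int, List[int]],
                   t2: Dict[int, List[int]]) -> Dict[int, List[int]]:
    """Join bucket b of t1 with bucket b of t2 only, as a map-side join would."""
    return {b: merge_join(t1[b], t2[b]) for b in range(NUM_BUCKETS)}


t1 = bucketize([5, 1, 4, 2, 8, 3])
t2 = bucketize([2, 3, 3, 8, 9, 4])
t3 = join_by_bucket(t1, t2)

for b, keys in t3.items():
    # Every output key belongs to the bucket it was produced in ...
    assert all(k % NUM_BUCKETS == b for k in keys)
    # ... and the bucket is already in sorted order.
    assert keys == sorted(keys)
```

Because each output bucket is produced from one pair of input buckets and the merge emits keys in ascending order, the per-bucket output could be written as the corresponding bucket file of T3 directly.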

