hadoop-hive-dev mailing list archives

From "He Yongqiang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1194) sorted merge join
Date Wed, 03 Mar 2010 19:28:27 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840825#action_12840825 ]

He Yongqiang commented on HIVE-1194:
------------------------------------

@namit, are you looking at the patch hive-1194-2010-3-2.2.patch?

For the last two queries you mentioned above,

select /*+mapjoin(a,b)*/ * from smb_bucket_1 a right outer join smb_bucket_2 b on a.key = b.key join smb_bucket_3 c on b.key = c.key

and

select /*+mapjoin(a,b)*/ * from smb_bucket_1 a right outer join smb_bucket_2 b on a.key = b.key right outer join smb_bucket_3 c on b.key = c.key


The results look good to me.
Results:

NULL	NULL	20	val_20	20	val_20
NULL	NULL	23	val_23	23	val_23

and

NULL	NULL	NULL	NULL	4	val_4
NULL	NULL	NULL	NULL	10	val_10
NULL	NULL	NULL	NULL	17	val_17
NULL	NULL	NULL	NULL	19	val_19
NULL	NULL	20	val_20	20	val_20
NULL	NULL	23	val_23	23	val_23


I will check how Oracle and MySQL handle the semantics of the first two queries you mentioned.

> sorted merge join
> -----------------
>
>                 Key: HIVE-1194
>                 URL: https://issues.apache.org/jira/browse/HIVE-1194
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: He Yongqiang
>             Fix For: 0.6.0
>
>         Attachments: hive-1194-2010-02-28.patch, hive-1194-2010-3-2.2.patch, hive-1194-2010-3-2.patch
>
>
> If the input tables are sorted on the join key, and a mapjoin is being performed, it
> is useful to exploit the sorted properties of the table.
> This can lead to substantial CPU savings - this needs to work across bucketed map joins
> also.
> Since sorted properties of a table are not enforced currently, a new parameter can be
> added to specify to use the sort-merge join.
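
For readers unfamiliar with the idea in the issue description, the following is a minimal sketch of a merge join over two inputs that are already sorted on the join key. It is not taken from the HIVE-1194 patch; the function name `sort_merge_join` and the sample rows are hypothetical, and it covers only an inner join over in-memory lists. It shows why sorted inputs help: each side is scanned once, with no hash table and no re-sorting.

```python
def sort_merge_join(left, right):
    """Inner-join two lists of (key, value) pairs, each already sorted by key.

    Hypothetical illustration only; not the implementation in the Hive patch.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1          # left key is too small, advance the left side
        elif lk > rk:
            j += 1          # right key is too small, advance the right side
        else:
            # Find the run of rows sharing this key on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Emit the cross product of the two runs.
            for _, lv in left[i:i_end]:
                for _, rv in right[j:j_end]:
                    out.append((lk, lv, rv))
            i, j = i_end, j_end
    return out


# Made-up sample rows, sorted by key on both sides.
left_rows = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
right_rows = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]
joined = sort_merge_join(left_rows, right_rows)
```

An outer join, like the right outer joins in the queries above, would additionally emit NULL-padded rows for keys that appear on only one side; that is omitted here to keep the sketch short.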

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

