hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Viral Bajaria <viral.baja...@gmail.com>
Subject partitioned column join does not work as expected
Date Tue, 18 Jan 2011 10:04:54 GMT
I am facing issues with a query where I am joining two fairly large tables
on the partitioned column along with other common columns. The expected
output is not in line with what I expect it to be. Since the query is very
complex, I will simplify it so that people can provide inputs if they have
faced similar issues or if I am doing something totally wrong.

TABLE A:
a_id bigint
common_id bigint
some_string string
total_count bigint
part_col string <---- this is the partitioned column

TABLE B:
b_int bigint
common_id bigint
some_string string
total_sum bigint
part_col string <---- this is the partitioned column

now the query is as follows:
SELECT /*+ STREAMTABLE(A,B) */ A.some_string, B.some_string,
sum(A.total_count), sum(B.total_sum) from A JOIN B ON (t1.part_col =
t2.part_col AND t1.common_id = t2.common_id) WHERE t1.part_col >= 'val1' AND
t2.part_col >= 'val1' GROUP BY A.some_string, B.some_string

Does HIVE not like to join on the partitioned columns ? because when i
create a join on just the partitioned column the reduce step never finishes.

I am using HIVE 0.5.0

Thanks,
Viral

Mime
View raw message