hive-user mailing list archives

From Ajo Fod <ajo....@gmail.com>
Subject Re: partitioned column join does not work as expected
Date Wed, 19 Jan 2011 04:40:18 GMT
Can you try this with a dummy table with very few rows ... to see if
the reason the script doesn't finish is a computational issue?
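
A minimal sketch of such a dummy-table test, assuming the schemas given in the
quoted message below. The names a_small and b_small and the LIMIT of 10 are
made up for illustration, and a static partition is used throughout:

```sql
-- Hypothetical small copies of A and B; a_small and b_small are illustrative names.
CREATE TABLE a_small (a_id BIGINT, common_id BIGINT, some_string STRING, total_count BIGINT)
PARTITIONED BY (part_col STRING);

CREATE TABLE b_small (b_int BIGINT, common_id BIGINT, some_string STRING, total_sum BIGINT)
PARTITIONED BY (part_col STRING);

-- Copy a handful of rows from one partition of each original table.
INSERT OVERWRITE TABLE a_small PARTITION (part_col = 'val1')
SELECT a_id, common_id, some_string, total_count
FROM A WHERE part_col = 'val1' LIMIT 10;

INSERT OVERWRITE TABLE b_small PARTITION (part_col = 'val1')
SELECT b_int, common_id, some_string, total_sum
FROM B WHERE part_col = 'val1' LIMIT 10;

-- Join only on the partition column and count the output rows.
-- With 10 rows per side in the same partition, this pairs every row with every other.
SELECT count(1)
FROM a_small a JOIN b_small b ON (a.part_col = b.part_col);
```

If the small join finishes quickly but returns far more rows than either input,
that would suggest the full-size join is slow because of the number of row
pairs it produces rather than because of partitioning itself.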

One other thing is to try with a combined partition, to see if it is a
problem with the partitioning.

Also, take a look at the output of an EXPLAIN statement to see if
there are any hints there.
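
As a sketch, the EXPLAIN check could look like the following. It uses the table
and column names from the quoted message, assumes the join is meant to be
between A and B directly, and leaves out the STREAMTABLE hint for brevity:

```sql
-- Prints the query plan (stages, join keys, filters) without running the job.
EXPLAIN
SELECT A.some_string, B.some_string, sum(A.total_count), sum(B.total_sum)
FROM A JOIN B
  ON (A.part_col = B.part_col AND A.common_id = B.common_id)
WHERE A.part_col >= 'val1' AND B.part_col >= 'val1'
GROUP BY A.some_string, B.some_string;
```

In the plan, it is worth checking which columns appear as join keys and where
the part_col filters are applied.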

NOTE: I'm new to hive too.

-Ajo


On Tue, Jan 18, 2011 at 8:08 PM, Viral Bajaria <viral.bajaria@gmail.com> wrote:
> I haven't heard back from anyone on the list and am still struggling to join
> two tables on a partitioned column.
>
> Has anyone ever tried joining two tables on a partitioned column and found
> that the results are not as expected?
>
> On Tue, Jan 18, 2011 at 2:04 AM, Viral Bajaria <viral.bajaria@gmail.com>
> wrote:
>>
>> I am facing issues with a query where I am joining two fairly large tables
>> on the partitioned column along with other common columns. The actual
>> output is not in line with what I expect. Since the query is very
>> complex, I will simplify it so that people can provide inputs if they have
>> faced similar issues or if I am doing something totally wrong.
>> TABLE A:
>> a_id bigint
>> common_id bigint
>> some_string string
>> total_count bigint
>> part_col string <---- this is the partitioned column
>> TABLE B:
>> b_int bigint
>> common_id bigint
>> some_string string
>> total_sum bigint
>> part_col string <---- this is the partitioned column
>> Now the query is as follows:
>> SELECT /*+ STREAMTABLE(A,B) */ A.some_string, B.some_string,
>> sum(A.total_count), sum(B.total_sum) FROM A JOIN B ON (A.part_col =
>> B.part_col AND A.common_id = B.common_id) WHERE A.part_col >= 'val1' AND
>> B.part_col >= 'val1' GROUP BY A.some_string, B.some_string
>> Does Hive not like joins on partitioned columns? When I
>> create a join on just the partitioned column, the reduce step never finishes.
>> I am using Hive 0.5.0.
>> Thanks,
>> Viral
>
