hive-user mailing list archives

From Abdelrhman Shettia <ashet...@hortonworks.com>
Subject Re: A bug of auto convert join with intermediate table?
Date Wed, 06 Feb 2013 20:40:10 GMT
Hi Zhong, 

Is it possible that you are hitting the following Hive bug? You may want to upgrade your
current Hive client.


https://issues.apache.org/jira/browse/HIVE-2095
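
Before upgrading, it may be worth confirming what the client is actually using. The following is only a rough sketch, assuming the same TPC-H tables as in your query: `set` with no value prints the current setting, and `explain` shows whether the plan contains a conditional stage with map join alternatives. The exact plan output differs between Hive versions, so treat it as a starting point rather than a definitive check.

```sql
-- Print the current values of the two settings that drive the conversion.
set hive.auto.convert.join;
set hive.mapjoin.smalltable.filesize;

-- Show the plan for the select part of the query, without running it.
explain
select c_custkey, o_orderkey, o_orderdate
from orders o join customer c on c.c_mktsegment = 'BUILDING' and
c.c_custkey = o.o_custkey
join lineitem l on l.l_orderkey = o.o_orderkey
where c.c_custkey < 1000;
```

If the plan lists map join alternatives for the second join but the job still falls back to the common join at run time, that would be consistent with a problem in the conditional task rather than in the settings.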


Thanks
-Abdelrhman 


Hortonworks, Inc.
Technical Support Engineer
Abdelrahman Shettia
ashettia@hortonworks.com
Office phone: (708) 689-9609
How am I doing?   Please feel free to provide feedback to my manager Rick Morris at rick@hortonworks.com


On Feb 6, 2013, at 5:28 AM, Zhong Wang <wangzhong.neu@gmail.com> wrote:

> Hi all,
> 
> I am running tests on Hive auto convert join. From the source code, it seems the conditional
> task considers the intermediate table size and runs the local task to generate the hash table
> on the intermediate table if it is smaller than the hive.mapjoin.smalltable.filesize threshold.
> However, I ran a very simple query based on TPC-H:
> 
> set hive.auto.convert.join=true;
> 
> insert overwrite table q3_tmp
> select c_custkey, o_orderkey, o_orderdate
> from orders o join customer c on c.c_mktsegment = 'BUILDING' and
> c.c_custkey = o.o_custkey
> join lineitem l on l.l_orderkey = o.o_orderkey
> where c.c_custkey < 1000;
> 
> The intermediate table of c join o is very small (50KB), which is far below the
> threshold. However, both map joins between the intermediate table and lineitem are filtered
> out by the conditional task. Is this a bug in auto convert join, or is something wrong with
> my usage/analysis?
> 
> Zhong

