hive-user mailing list archives

From Zhong Wang
Subject A bug of auto convert join with intermediate table?
Date Wed, 06 Feb 2013 13:28:30 GMT
Hi all,

I am testing Hive's auto convert join. From the source code, it seems that
the conditional task takes the intermediate table size into account and
runs the local task that builds the hashtable on the intermediate table if
it is smaller than the threshold set by hive.mapjoin.smalltable.filesize.
However, I ran a very simple query based on TPC-H:


insert overwrite table q3_tmp
select c_custkey, o_orderkey, o_orderdate
from orders o join customer c on c.c_mktsegment = 'BUILDING' and
c.c_custkey = o.o_custkey
join lineitem l on l.l_orderkey = o.o_orderkey
where c.c_custkey < 1000;

The intermediate table of c join o is very small (50KB), far below the
threshold. However, both map join candidates for the join of the
intermediate table with lineitem are filtered out by the conditional task.
Is this a bug in auto convert join, or is something wrong with my usage or
analysis?
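
For reference, a minimal sketch of the session settings involved in this
kind of test. The values below are assumptions for illustration (25000000
bytes is the documented default threshold); the settings actually used in
the run described above are not stated in this message.

```sql
-- Assumed settings, not taken from the original test run.
-- Enable conversion of common joins to map joins via a conditional task.
set hive.auto.convert.join=true;
-- Size threshold in bytes below which a table counts as "small".
set hive.mapjoin.smalltable.filesize=25000000;

-- Prefixing the query with EXPLAIN prints the plan, including the
-- conditional task and its candidate map join stages.
explain
insert overwrite table q3_tmp
select c_custkey, o_orderkey, o_orderdate
from orders o join customer c on c.c_mktsegment = 'BUILDING' and
c.c_custkey = o.o_custkey
join lineitem l on l.l_orderkey = o.o_orderkey
where c.c_custkey < 1000;
```

Comparing the stages listed in the plan with the tasks that actually run
may help show which map join candidates are dropped at run time.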

