From: Ross Guth
Subject: Re: Hash table in map join - Hive
Date: Tue, 28 Jun 2016
Hi Gopal,

Thanks a lot for the answers. They were helpful.
I have a few more questions regarding this:

1. OOM condition -- When I force a map join in Hive/Tez with a small
container size and heap size, I get the following error:
"java.lang.OutOfMemoryError: Java heap space". What condition leads to
this error? Does it occur when even one of the 16 hash table partitions
cannot fit in memory? I tried setting
hive.mapjoin.hybridgrace.minnumpartitions to higher values such as 50 and
100, expecting the size of each partition to drop drastically (the join
key has no skew in its distribution), but I still get the OOM error.
What is the cause?
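
To make the expectation in question 1 concrete, here is a rough sketch of
the arithmetic behind it. It assumes keys are spread evenly across
partitions, and the 2048 MB hash table size is a purely hypothetical figure
(it does not come from this thread). It does not model how Hive actually
accounts for memory, so it cannot say why the OOM still occurs.

```python
def partition_size_mb(hashtable_mb: float, num_partitions: int) -> float:
    """Estimated size of one partition, assuming keys are spread evenly."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return hashtable_mb / num_partitions


if __name__ == "__main__":
    HASHTABLE_MB = 2048.0  # hypothetical total hash table size, for illustration only
    # 16 is the default quoted below; 50 and 100 are the values tried above.
    for n in (16, 50, 100):
        size = partition_size_mb(HASHTABLE_MB, n)
        print(f"{n:>3} partitions -> about {size:.1f} MB per partition")
```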

2. Shuffle hash join -- I am using Hive 2.0.1. How can I force this join
implementation? Is there any documentation on it?

3. Hash table size -- I use "--hiveconf hive.root.logger=INFO,console" to
see the logs, but I do not see the hash table size in them. I also tried
"--hiveconf hive.tez.exec.print.summary=true". The output was:

METHOD                         DURATION(ms)
parse                                  977
semanticAnalyze                      2,435
TezBuildDag                            473
TezSubmitToRunningDag                  841
TotalPrepTime                       10,451

Map 1     3    0    0     37.11       65,710      1,039     15,000,000    15,000,000
Map 2    65    0    0    498.77    4,163,920    100,613    615,037,902             0
This doesn't seem to include information about the hash table or the
number of items shuffled. Am I missing something?


On Mon, Jun 27, 2016 at 9:10 AM, Gopal Vijayaraghavan wrote:

> > 1. Is there a way to check the size of the hash table created during map
> >side join in Hive/Tez?
> Only from the log files.
> However, if you enable hive.tez.exec.print.summary=true, the Hive CLI
> will print out the total # of items shuffled from the broadcast edges
> feeding the hashtable.
> Not sure if you might have the reduce-side map-join in your builds, but
> that is a bit harder to look into.
> > 2. Is the hash table (small table's), created for the entire table or
> >only for the selected and join key columns?
> Yup. Project, filter and group-by (in case of semi-joins).
> select a from tab1 where a IN (select b from tab2 ...);
> will inject a "select distinct b" into the plan.
> > 3. The hash table (created in map side join) spills to disk if it does
> >not fit in memory. Is there a parameter in hive/tez to specify the
> >percentage of the hash file which can spill?
> Not directly.
> hive.mapjoin.hybridgrace.minnumpartitions=16
> by default. So 1/16th of your key space will spill, whenever it hits the
> spilling conditions - for the small table.
> In general, the Snowflake-model dimension tables are joined by their
> primary key, so the key-space corresponds to the row-distribution too.
> For the big table it will spill only a smaller fraction since the
> BloomFilter built during hashtable generation is not spilled, so anything
> which misses the bloom filter will not spill to disk during the join.
> All that said, the spilling hash-join is much slower than the shuffling
> hash-join (new in 2.0), because the grace hash-join drops parts of the
> hash out of memory after each iteration & has to rebuild it for each split
> processed.
> In terms of total CPU, building a 4 million row hash table in 600 tasks is
> slower than building a 7500 row hashtable x 600 times - the hashtable
> lookup goes up by LG(N) too.
> Ask me more questions, if you need more info.
> Cheers,
> Gopal
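
The CPU comparison in the quoted reply can be checked with some simple
arithmetic. This is only a sketch: it takes the figures quoted above (a
4 million row table, a 7500 row table, 600 tasks) at face value, reads
"LG(N)" as log base 2 (an assumption), and counts rows inserted rather than
measuring real CPU time. The function names are made up for illustration.

```python
import math


def total_build_rows(rows_per_table: int, num_tasks: int) -> int:
    """Rows inserted into hash tables, summed over every task."""
    return rows_per_table * num_tasks


def lookup_cost_factor(rows: int) -> float:
    """Relative per-lookup cost under the quoted LG(N) model (log base 2 assumed)."""
    return math.log2(rows)


if __name__ == "__main__":
    TASKS = 600
    # Every task builds the full 4 million row table.
    broadcast = total_build_rows(4_000_000, TASKS)
    # Each task builds only its own 7500 row table.
    shuffled = total_build_rows(7_500, TASKS)
    print(f"full table per task : {broadcast:,} rows built in total")
    print(f"small table per task: {shuffled:,} rows built in total")
    print(f"lookup factor, 4,000,000 rows: {lookup_cost_factor(4_000_000):.1f}")
    print(f"lookup factor,     7,500 rows: {lookup_cost_factor(7_500):.1f}")
```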

