hive-user mailing list archives

From Ryan Harris <>
Subject RE: Issue joining 21 HUGE Hive tables
Date Thu, 24 Mar 2016 05:24:40 GMT
The query that you are using would have to be analyzed to know how much it could be optimized.
The small tables should be able to be handled with a map join; depending on your Hive version,
that may already be happening automatically.
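
A minimal sketch of how you might check this, assuming a Hive version that supports automatic
map-join conversion. The table and column names (fact_a, dim_small, dim_id, label) are
hypothetical, and the size threshold is only an example value; check the defaults for your version.

```sql
-- Let Hive convert joins against small tables into map joins automatically.
SET hive.auto.convert.join=true;
SET hive.auto.convert.join.noconditionaltask=true;
-- Combined size limit (in bytes) for the small tables; example value only.
SET hive.auto.convert.join.noconditionaltask.size=10000000;

-- fact_a and dim_small are hypothetical table names.
-- If the conversion applies, the plan should contain a Map Join Operator.
EXPLAIN
SELECT f.id, d.label
FROM fact_a f
JOIN dim_small d ON f.dim_id = d.dim_id;
```

If the plan still shows a regular shuffle join for a table with only a few hundred rows, the
size threshold or missing table statistics would be the first things to look at.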
Hive will be doing the joins in stages.
You could manually implement the stages to assist in optimization and troubleshooting. Once
you know how long each join stage is taking, you can figure out where things are getting out
of hand.
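
One way to stage the joins by hand is to materialize each intermediate result as its own table,
so each step can be timed separately. This is only a sketch; big_a, big_b, big_c, stage1, stage2
and the column names are hypothetical, and ORC is just one possible storage format.

```sql
-- Stage 1: join the first two large tables and keep the result.
CREATE TABLE stage1 STORED AS ORC AS
SELECT a.key_col, a.val_a, b.val_b
FROM big_a a
JOIN big_b b ON a.key_col = b.key_col;

-- Stage 2: join the intermediate result with the next large table.
CREATE TABLE stage2 STORED AS ORC AS
SELECT s.key_col, s.val_a, s.val_b, c.val_c
FROM stage1 s
LEFT OUTER JOIN big_c c ON s.key_col = c.key_col;
```

Checking the row count of each stage table also helps catch a join that is multiplying rows
unexpectedly.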
Depending on the data, you might be able to partition it or bucket it, to help with join optimization.
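
As a rough illustration of bucketing for joins, assuming the large tables share a join key:
the table name, columns, and bucket count below are hypothetical, and which of these settings
are needed varies by Hive version.

```sql
-- Bucketed, sorted copy of a large table (names and bucket count are examples).
CREATE TABLE big_a_bucketed (key_col BIGINT, val_a STRING)
CLUSTERED BY (key_col) SORTED BY (key_col) INTO 64 BUCKETS
STORED AS ORC;

-- Needed on older Hive versions so inserts honor the bucket definition.
SET hive.enforce.bucketing=true;

INSERT OVERWRITE TABLE big_a_bucketed
SELECT key_col, val_a FROM big_a;

-- Allow bucket-aware joins when both sides are bucketed on the join key.
SET hive.optimize.bucketmapjoin=true;
SET hive.auto.convert.sortmerge.join=true;
```

For this to help, the other large tables would need to be bucketed on the same key, with bucket
counts that are equal to or multiples of each other.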
And ultimately, depending on the size and complexity of the query compared to the size/capacity
of your cluster, it could simply take hours for the query to run. Are the tasks finishing? If you
have one or two long-running tasks while everything else has completed, those tasks are where to
look, but if the job is chugging through the stages and tasks aren't failing, you may just need
more resources.

From: Sanka, Himabindu []
Sent: Wednesday, March 23, 2016 7:50 PM
Subject: Issue joining 21 HUGE Hive tables

Hi Team,

I need some inputs from you. I have a requirement for my project where I have to join 21 hive
external tables.

Of these, 6 tables are HUGE, having 500 million records of data. The other 15 tables are smaller
ones, around 100 to 1000 records each.

When I am doing inner joins / left outer joins, it's taking hours to run the query.

Please let me know some optimization techniques or any other ecosystem components that perform
better than Hive.

