hive-dev mailing list archives

From Sukhendu Chakraborty <sukhendu.chakrabo...@gmail.com>
Subject HIVE optimizer enhancements in 0.9.0+ releases
Date Thu, 15 Nov 2012 20:37:49 GMT
Hi,

I am a HIVE user who is working on analytical applications on large
data sets. For us, HIVE performance is critical to the success of
our product. I was wondering whether any recent improvements have
been made in the optimizer layer. One of the relevant references I
found on the web is the HIVE paper
(http://infolab.stanford.edu/~ragho/hive-icde2010.pdf). If you can
send me any pointers on current enhancements, that would be great.

Some specific improvements I am looking for are:
1. Cost based optimization (logical or physical)
2. "multi-query optimization techniques and performing generic n-way
joins in a single map-reduce job" (quoted from the future work section
of the paper above)
3. Generating and using table statistics to produce better
plans, faster execution, etc. I know some code was added to
generate column statistics for HIVE tables. Is there any table-level
statistics generation?

Thanks for your help,
-Sukhendu
