hive-user mailing list archives

From Alan Gates <alanfga...@gmail.com>
Subject Re: Using Spark on Hive with Hive also using Spark as its execution engine
Date Tue, 12 Jul 2016 14:39:01 GMT

> On Jul 11, 2016, at 16:22, Mich Talebzadeh <mich.talebzadeh@gmail.com> wrote:
> 
> <snip>
> 	• If I add LLAP, will that be more efficient in terms of memory usage compared to
> Hive or not? Will it keep the data in memory for reuse or not.
> 	
Yes, this is exactly what LLAP does.  It keeps a cache of hot data (hot columns of hot partitions)
and shares that across queries.  Unlike many MPP caches, it will cache the same data on multiple
nodes if more workers want to access the data than can run on a single node.
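
To make that last point concrete, here is a toy model in Python. It is only a sketch of the idea, not LLAP's actual scheduling or cache code; the function name, the fixed workers-per-node limit, and the numbers are all assumptions made up for illustration.

```python
import math


def nodes_caching(workers_wanting_data: int, workers_per_node: int) -> int:
    """Toy model: how many nodes end up holding a copy of the same hot data.

    Assumes every node can run at most `workers_per_node` workers and that
    a node caches the data as soon as one of its workers reads it.
    """
    if workers_per_node <= 0:
        raise ValueError("workers_per_node must be positive")
    if workers_wanting_data <= 0:
        return 0
    # Once demand exceeds what one node can run, the extra workers land on
    # other nodes, and each of those nodes caches the same data as well.
    return math.ceil(workers_wanting_data / workers_per_node)


print(nodes_caching(3, 4))   # demand fits on a single node
print(nodes_caching(10, 4))  # demand spills over to additional nodes
```

In this model the data is cached once while demand fits on one node, and on more nodes only when demand outgrows it.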

As a side note, it is considered bad form in Apache to send a message to two lists.  It causes
a lot of background noise for people on the Spark list who probably aren’t interested in
Hive performance.

Alan.


