hive-user mailing list archives

From Bing Jiang <jiangbinglo...@gmail.com>
Subject Re: Caching intermediate data in tez object registry
Date Wed, 02 Dec 2015 03:42:26 GMT
Hi Raajay,

https://issues.apache.org/jira/browse/HIVE-7313 provides a potential
solution for storing intermediate data in memory or on SSD. However, it
relies on the HDFS feature of multiple storage types (see
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html
).
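
As a rough sketch of how that could look in a Hive session: this assumes
the setting introduced by HIVE-7313 is hive.exec.temporary.table.storage
(values memory, ssd, default) and that the cluster's HDFS has the
matching storage types configured. The table and column names are
hypothetical. Note that temporary tables are session-scoped, so this
only helps queries that run in the same session.

```sql
-- Assumption: hive.exec.temporary.table.storage is the setting added by
-- HIVE-7313; accepted values are memory, ssd and default.
SET hive.exec.temporary.table.storage=memory;

-- Hypothetical table and column names, for illustration only.
-- Materialize the expensive scan once into a temporary table.
CREATE TEMPORARY TABLE map_stage_cache AS
SELECT user_id, event_type, event_time
FROM raw_events
WHERE event_date = '2015-12-01';

-- Later queries in the same session read the temporary table
-- instead of rescanning raw_events.
SELECT event_type, COUNT(*) AS events
FROM map_stage_cache
GROUP BY event_type;
```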

2015-12-02 5:17 GMT+08:00 Raajay <raajay.v@gmail.com>:

> Hello,
>
> My setup is Hive on Tez. I find that for most of my queries, the map
> stage takes the longest. Is it possible to use the Tez Shared Object
> Registry to cache the intermediate data to improve the performance of
> recurring queries?
>
> If yes, how would I do it? Assume that the nodes I run on have
> sufficient RAM to store all intermediate data.
>
> Raajay
>



-- 
Bing Jiang
