hive-user mailing list archives

From Db-Blog <mpp.databa...@gmail.com>
Subject Re: Bucketing- Identify Number of Buckets
Date Sun, 06 Sep 2015 20:22:21 GMT
Details of the Hive version:
I am using Hive 0.14.0 with Tez as the execution engine.

Thanks,
Saurabh

Sent from my iPhone, please avoid typos.

> On 07-Sep-2015, at 1:51 am, Db-Blog <mpp.databases@gmail.com> wrote:
> 
> Hi, 
> 
> I need to join two big tables in Hive. The join key is the grain of both tables, so clustering and sorting on it should significantly improve join performance.
> 
> However, I am not sure how to calculate the exact number of buckets when creating these tables. Can someone please share any pointers?
> 
> I am planning to store these clustered and sorted tables as Parquet or ORC, for columnar storage and better compression.
> 
> Thanks,
> Saurabh
> 
> Sent from my iPhone, please avoid typos.
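
To make the bucket-count question above concrete, here is a minimal sketch of one common rule of thumb, not an official Hive formula: divide the table size by a target bucket size and round up to a power of two, then use the same count for both tables so a sort-merge bucket join stays possible. The 256 MiB target, the table sizes, and the names orders_bucketed, payments_bucketed, join_key and payload are all hypothetical placeholders.

```python
def estimate_bucket_count(table_bytes: int,
                          target_bucket_bytes: int = 256 * 1024**2) -> int:
    """Heuristic (an assumption, not a Hive rule): table size divided by a
    target bucket size, rounded up to the next power of two."""
    if table_bytes <= 0 or target_bucket_bytes <= 0:
        raise ValueError("sizes must be positive")
    raw = -(-table_bytes // target_bucket_bytes)  # ceiling division on ints
    return 1 << (raw - 1).bit_length()            # next power of two >= raw


def bucketed_table_ddl(table: str, key: str, buckets: int) -> str:
    """Render HiveQL for a table clustered and sorted on the join key.
    The two-column schema is a placeholder for illustration only."""
    return (
        f"CREATE TABLE {table} (\n"
        f"  {key} BIGINT,\n"
        f"  payload STRING\n"
        f")\n"
        f"CLUSTERED BY ({key}) SORTED BY ({key} ASC) "
        f"INTO {buckets} BUCKETS\n"
        f"STORED AS ORC"
    )


if __name__ == "__main__":
    # Hypothetical table sizes in bytes (100 GiB and 30 GiB).
    sizes = {
        "orders_bucketed": 100 * 1024**3,
        "payments_bucketed": 30 * 1024**3,
    }
    # Use one shared count, driven by the larger table, for both sides.
    shared = max(estimate_bucket_count(s) for s in sizes.values())
    for name in sizes:
        print(bucketed_table_ddl(name, "join_key", shared) + ";\n")
```

Picking powers of two keeps the bucket counts of different tables multiples of one another, which is what a bucket map join needs, while a sort-merge bucket join needs the counts to match. On Hive releases before 2.0, I believe hive.enforce.bucketing and hive.enforce.sorting also have to be set to true when loading such tables; please verify this against the documentation for your version.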
