hive-user mailing list archives

From Alain Petrus <alain.petru...@gmail.com>
Subject Re: Handling updates to Bucketed Table
Date Thu, 18 Sep 2014 19:39:41 GMT
Hi all,

Very interesting question. In my case, I have a date partition and I am using 8 buckets that
are sorted on id.
I am wondering what happens when new data is added to this table. The data will be put in the
correct partition, but will it be bucketed?
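
To make the question concrete, here is a minimal Python sketch of how rows map to buckets. It is
not Hive code. It assumes the classic Hive rule for an integer key, (hash & Integer.MAX_VALUE) %
numBuckets with the hash of an int being the value itself; newer Hive versions may hash
differently. The names bucket_for and bucketize are made up for illustration.

```python
# Illustrative sketch only, not Hive's implementation.
# Assumption: bucket = (hash & Integer.MAX_VALUE) % num_buckets, hash(int) = value.
INT_MAX = 0x7FFFFFFF
NUM_BUCKETS = 8  # the 8 buckets mentioned above


def bucket_for(row_id: int, num_buckets: int = NUM_BUCKETS) -> int:
    """Return the bucket index a row with this integer id would land in."""
    return (row_id & INT_MAX) % num_buckets


def bucketize(ids, num_buckets=NUM_BUCKETS):
    """Group ids by bucket and sort inside each bucket (like SORTED BY id)."""
    buckets = {b: [] for b in range(num_buckets)}
    for row_id in ids:
        buckets[bucket_for(row_id, num_buckets)].append(row_id)
    return {b: sorted(rows) for b, rows in buckets.items()}


# Two separate loads into the same partition.
existing = bucketize([1, 9, 10, 3])
new_load = bucketize([17, 2, 11])
```

Each load produces its own sorted rows per bucket, so ids 9 and 17 both belong to bucket 1 but
arrive in different loads. Keeping a single sorted file per bucket would mean merging the new
rows into the existing ones, which is the rebuild discussed below.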

Thanks for your help,
Alain


On 18 Sep 2014, at 20:02, Kumar V <kumarbuyonline@yahoo.com> wrote:

> Thx Nitin. I just wanted to confirm before I give up. I'll probably do a daily partition and see how it goes.
> 
> Thanks.
> 
> 
> On Thursday, September 18, 2014 12:30 PM, Nitin Pawar <nitinpawar432@gmail.com> wrote:
> 
> 
> When you bucket the data in a partition,
> a file is created for each bucket (rows are assigned to buckets by hashing the bucketing key).
>
> Now if you add more data to the same bucket, that file would need to be rebuilt.
> 
> I would prefer a partition at day level under month level, where I write the data once a day and bucket the data there.
> 
> 
> I am not sure Hive supports appending to bucketed files yet.
> Please wait for others to answer as well.
> 
> On Thu, Sep 18, 2014 at 9:27 PM, Kumar V <kumarbuyonline@yahoo.com> wrote:
> Hi,
>     I would like to know how to handle frequent updates to bucketed tables. Is there a way to update without a rebuild?
> I have a monthly partition for a table with buckets, but I have to update the table every day. Is there a way to achieve this without rebuilding the partition every day? Or is this the wrong use case for a bucketed table?
> This table is joined with another table, so I thought bucketing would speed up the queries. What are my options?
> 
> Please let me know.
> 
> Regards,
> Murali.
> 
> 
> 
> 
> -- 
> Nitin Pawar
> 
> 

