kudu-user mailing list archives

From Sand Stone <sand.m.st...@gmail.com>
Subject Re: best practices to remove/retire data
Date Thu, 12 May 2016 17:10:43 GMT
Thanks J-D.

Any idea when partition-level deletion will be implemented?

On Thu, May 12, 2016 at 8:24 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:

> Hi,
> Right now this use case is more difficult than it needs to be. In your
> previous thread, "Partition and Split rows", we talked about non-covering
> range partitions, and this is something that would help your use case a lot.
> Basically, you could create partitions that cover full days, and every day
> you could delete the old partitions while creating the next day's. Deleting
> a partition is really quick and efficient compared to manually deleting
> individual rows.
> Until this is available I'd do this with multiple tables, but it's a mess
> to handle as you described.
> Hope this helps,
> J-D
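The daily rotation J-D describes can be sketched as a pure planning step: given the days that currently have a partition, work out which day-wide ranges to drop and which to add. This is only an illustration under stated assumptions. It makes no Kudu client calls, since partition-level add/drop is not available as of this thread. The names `daily_partition` and `roll_partitions` are hypothetical, and `retain_days` is assumed to count today as one of the retained days.

```python
from datetime import date, timedelta


def daily_partition(day):
    """Return the [lower, upper) bounds of a range partition covering one day."""
    return (day, day + timedelta(days=1))


def roll_partitions(existing, today, retain_days=3):
    """Plan one maintenance run (hypothetical helper, not a Kudu API).

    existing: set of dates that currently have a day-wide partition.
    Returns (to_drop, to_add) as lists of [lower, upper) date bounds.
    Assumption: keep the last `retain_days` days including today,
    and pre-create tomorrow's partition so ingestion never stalls.
    """
    keep = {today - timedelta(days=i) for i in range(retain_days)}
    keep.add(today + timedelta(days=1))
    to_drop = [daily_partition(d) for d in sorted(existing - keep)]
    to_add = [daily_partition(d) for d in sorted(keep - existing)]
    return to_drop, to_add
```

Running this once a day would yield one old range to drop and one new range to create, which is the cheap operation described above, as opposed to deleting rows one by one.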
> On Thu, May 12, 2016 at 8:16 AM, Sand Stone <sand.m.stone@gmail.com>
> wrote:
>> Hi. Presumably I need to write a program to delete the unwanted rows,
>> say, remove all data older than 3 days, while the table is still ingesting
>> new data.
>> How well will this perform for large tables, in terms of both deletion
>> and ingestion?
>> Or, for this specific case where I retire data by day, should I create a
>> new table per day? However, the users then have to be aware of the table
>> naming scheme somehow. If the retention policy is changed, all the
>> client-side code might have to change (sure, we can have one level of
>> indirection to minimize the pain).
>> Thanks.
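The "one level of indirection" for the table-per-day workaround could look roughly like the following. It is a minimal sketch, not anything from the Kudu client: the `events_YYYYMMDD` naming scheme and the helper names are hypothetical, and `retain_days` is again assumed to include today. Readers and the cleanup job would call these helpers rather than hard-coding table names, so a retention change touches one place.

```python
from datetime import date, timedelta

# Hypothetical naming scheme: one table per day, e.g. events_20160512.
TABLE_PREFIX = "events_"


def table_for(day):
    """Name of the table holding data for the given day."""
    return TABLE_PREFIX + day.strftime("%Y%m%d")


def live_tables(today, retain_days=3):
    """Tables a reader should scan, oldest first (today counts as one day)."""
    return [table_for(today - timedelta(days=i))
            for i in reversed(range(retain_days))]


def expired_tables(all_tables, today, retain_days=3):
    """Per-day tables older than the retention window, safe to drop.

    Names with the shared prefix and a YYYYMMDD suffix sort
    chronologically, so a plain string comparison is enough.
    Tables that do not match the prefix are ignored.
    """
    cutoff = table_for(today - timedelta(days=retain_days - 1))
    return sorted(t for t in all_tables
                  if t.startswith(TABLE_PREFIX) and t < cutoff)
```

This only hides the naming scheme; queries spanning several days still have to fan out over multiple tables, which is the mess mentioned above.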
