kudu-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: best practices to remove/retire data
Date Thu, 12 May 2016 17:32:41 GMT
It should be fully implemented in 1.0, which we're aiming to release in
August. You can follow this JIRA: https://issues.apache.org/jira/browse/KUDU-1306


On Thu, May 12, 2016 at 10:10 AM, Sand Stone <sand.m.stone@gmail.com> wrote:

> Thanks J-D.
> Any idea when partition-level deletion will be implemented?
> On Thu, May 12, 2016 at 8:24 AM, Jean-Daniel Cryans <jdcryans@apache.org>
> wrote:
>> Hi,
>> Right now this use case is more difficult than it needs to be. In your
>> previous thread, "Partition and Split rows", we talked about non-covering
>> range partitions, which would help your use case a lot.
>> Basically, you could create partitions that each cover a full day, and every
>> day you could delete the oldest partition while creating the next day's. Deleting
>> a partition is really quick and efficient compared to manually deleting
>> individual rows.
>> Until this is available I'd do this with multiple tables, but it's a mess
>> to handle, as you described.
>> Hope this helps,
>> J-D
>> On Thu, May 12, 2016 at 8:16 AM, Sand Stone <sand.m.stone@gmail.com>
>> wrote:
>>> Hi. Presumably I need to write a program to delete the unwanted rows,
>>> say, remove all data older than 3 days, while the table is still ingesting
>>> new data.
>>> How well will this perform for large tables, in terms of both deletion and
>>> ingestion?
>>> Or, for this specific case where I retire data by day, should I create a
>>> new table per day? Then, however, the users have to be aware of the table
>>> naming scheme somehow. If the retention policy changes, all the client-side
>>> code might have to change (sure, we can have one level of indirection to
>>> minimize the pain).
>>> Thanks.
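
A minimal sketch of the bookkeeping behind the daily rollover described in the thread (one range partition per day, drop the old ones, create the next day's ahead of time). It only computes which days to drop and add; it makes no Kudu calls, since partition-level deletion was not yet available when this was written. The function names, the UTC day boundaries, and the reading of "older than 3 days" as "keep today plus the two previous days" are all assumptions made for illustration.

```python
from datetime import date, datetime, timedelta, timezone


def day_bounds(day):
    """Return the [lower, upper) UTC datetimes covering one calendar day.

    Hypothetical helper: these would be the range bounds of that day's
    partition, assuming the table is range-partitioned on a timestamp.
    """
    lower = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    return lower, lower + timedelta(days=1)


def plan_rollover(existing_days, today, retention_days=3):
    """Decide which daily partitions to drop and which to add.

    existing_days: set of dates that currently have a partition.
    Keeps the last `retention_days` days (including today) and ensures
    tomorrow's partition exists before data for it arrives.
    """
    oldest_kept = today - timedelta(days=retention_days - 1)
    # Days that should exist: the retained window plus tomorrow.
    wanted = {oldest_kept + timedelta(days=i) for i in range(retention_days + 1)}
    to_drop = sorted(d for d in existing_days if d < oldest_kept)
    to_add = sorted(wanted - set(existing_days))
    return to_drop, to_add


if __name__ == "__main__":
    existing = {date(2016, 5, 9), date(2016, 5, 10),
                date(2016, 5, 11), date(2016, 5, 12)}
    drop, add = plan_rollover(existing, today=date(2016, 5, 12))
    for d in drop:
        print("drop partition", day_bounds(d))
    for d in add:
        print("add partition", day_bounds(d))
```

Run once a day, this keeps the retention policy in one place: changing `retention_days` changes what gets dropped without touching client code that reads from the single table.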
