CQL lets you specify a default TTL per column family/table, as a table option: default_time_to_live=86400.
From: Redmumba [mailto:email@example.com]
Sent: Monday, April 28, 2014 12:51 PM
Subject: Re: Cassandra data retention policy
Have you looked into using a TTL? You can set this per insert (unfortunately, it can't be set per CF) and values will be tombstoned after that amount of time. I.e.,
INSERT INTO .... VALUES ... USING TTL 15552000
Keep in mind, after the values have expired, they will essentially become tombstones--so you will still need to run clean-ups (probably daily) to clear up space.
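For reference, a minimal CQL sketch of the two ways a TTL can be attached, per write and as a table-wide default. The keyspace, table, and column names (archive.events, id, payload) and the UUID are placeholders made up for illustration, not taken from this thread; 15552000 seconds is 180 days, roughly six months.

```cql
-- Hypothetical table; all names here are placeholders.
CREATE TABLE archive.events (
    id uuid PRIMARY KEY,
    payload text
) WITH default_time_to_live = 15552000;  -- table-wide default: 180 days

-- Per-write TTL; applies to the cells written by this statement
-- and takes precedence over the table default.
INSERT INTO archive.events (id, payload)
VALUES (123e4567-e89b-12d3-a456-426614174000, 'example')
USING TTL 15552000;

-- Remaining lifetime, in seconds, of a cell.
SELECT TTL(payload) FROM archive.events
WHERE id = 123e4567-e89b-12d3-a456-426614174000;
```

Expired cells are treated as tombstones and are only dropped from disk by compaction once the table's gc_grace_seconds has passed, so space is not reclaimed the moment the TTL runs out.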
Does this help?
One caveat is that this is difficult to apply to existing rows--i.e., you can't bulk-update a bunch of rows with this data. As such, another good suggestion is to simply have a secondary index on a date field of some kind, and run a bulk remove (and subsequent clean-up) daily/weekly/whatever.
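A rough sketch of the date-index approach, under some assumptions: the table and column names are hypothetical, and the date is stored as a text "day bucket" (e.g. '2013-10-28') because a secondary index only supports equality lookups, not range scans over a timestamp.

```cql
-- Hypothetical table with a day-bucket column; names are placeholders.
CREATE TABLE archive.events_by_id (
    id uuid PRIMARY KEY,
    day text,        -- e.g. '2013-10-28', written by the application
    payload text
);

CREATE INDEX events_day_idx ON archive.events_by_id (day);

-- Step 1: find the keys written on a day that has aged out.
SELECT id FROM archive.events_by_id WHERE day = '2013-10-28';

-- Step 2: from the client, delete each key returned by step 1.
DELETE FROM archive.events_by_id
WHERE id = 123e4567-e89b-12d3-a456-426614174000;
```

The deletes have to be issued per key from the client, since CQL has no delete-by-indexed-column. Each delete writes a tombstone, so disk space is reclaimed only after compaction, same as with expired TTLs.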
On Mon, Apr 28, 2014 at 11:31 AM, Han Jia <firstname.lastname@example.org> wrote:
Hi guys,
We have a processing system that just uses the data for the past six months in Cassandra. Any suggestions on the best way to manage the old data in order to save disk space? We want to keep it as backup but it will not be used unless we need to do recovery. Thanks in advance!