cassandra-user mailing list archives

From Jeff Jirsa
Subject Re: Choosing a compaction strategy (TWCS)
Date Fri, 16 Dec 2016 18:47:42 GMT
With a 10-year retention, just ignore the target sstable count (I should remove that guidance,
to be honest) and go for a 1-week window to match your partition size. Roughly 520 sstables on
disk (10 years x 52 weeks) isn’t going to hurt you as long as you’re not reading from all of
them, and with a partition per week the bloom filter is going to make things nice and easy for you.
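
As a rough check on that figure, here is a minimal sketch of the arithmetic. It assumes about one sstable per closed TWCS window and a 365.25-day year; the helper name `window_count` is made up for illustration and is not part of any Cassandra tooling.

```python
import math

DAYS_PER_YEAR = 365.25  # assumption: average year length, leap days included


def window_count(retention_years: float, window_days: int) -> int:
    """Number of TWCS windows needed to cover the retention period.

    Assumes each closed window ends up as roughly one sstable, so this is
    also an approximate on-disk sstable count per table.
    """
    return math.ceil(retention_years * DAYS_PER_YEAR / window_days)


if __name__ == "__main__":
    # Compare a few candidate window sizes for a 10-year retention.
    for days in (1, 7, 30):
        print(f"{days:>2}-day window -> {window_count(10, days)} windows")
```

The 7-day case comes out a couple above 520 only because 52 weeks is slightly less than a full year.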


- Jeff



From: Voytek Jarnot
Date: Friday, December 16, 2016 at 10:37 AM
Subject: Choosing a compaction strategy (TWCS)



I'm converting an Oracle table to Cassandra (one Oracle table to 4 Cassandra tables); the data is
basically time-series, think log or auditing. Retention is 10 years, but more than 95% of reads
will occur on data written within the last year. A 7-day TTL is used on a small percentage of the
records; the majority do not use a TTL. Other than the aforementioned TTL and the 10-year purge,
no updates or deletes are done.


Seems like TWCS is the right choice, but I have a few questions/concerns:


1) I'll be bulk loading a few years of existing data upon deployment - any issues with that?
I assume specifying USING TIMESTAMP when inserting this data will be mandatory if I choose TWCS?


2) I read here that "You should target fewer than 50
buckets per table based on your TTL." That's going to be a tough goal with a 10 year retention
... can anyone speak to how important this target really is?


3) If I'm bucketing my data with week/year (i.e., partition on year, week - so today would
be in 2016, 50), it seems like a natural fit for compaction_window_size would be 7 days, but
I'm thinking my calendar-based weeks will never align with TWCS 7-day-period weeks anyway
- am I missing something there?
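
To make the alignment question in 3) concrete, here is a small sketch. It rests on an assumption about TWCS that should be verified against the Cassandra version in use: that a window's lower bound is the write time floored to a multiple of the window size counted from the Unix epoch. The function names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

WINDOW_DAYS = 7  # stands in for compaction_window_size = 7 with unit DAYS


def twcs_window_start(ts: datetime, window_days: int = WINDOW_DAYS) -> datetime:
    """Assumed TWCS rule: floor to a multiple of the window size since the epoch."""
    window_seconds = window_days * 86400
    epoch_seconds = int(ts.timestamp())
    floored = epoch_seconds - epoch_seconds % window_seconds
    return datetime.fromtimestamp(floored, tz=timezone.utc)


def iso_week_start(ts: datetime) -> datetime:
    """Monday 00:00 of the ISO calendar week containing ts."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=ts.weekday())


if __name__ == "__main__":
    write_time = datetime(2016, 12, 16, 18, 47, 42, tzinfo=timezone.utc)
    print("ISO (year, week):  ", tuple(write_time.isocalendar())[:2])
    print("TWCS window start: ", twcs_window_start(write_time))
    print("ISO week start:    ", iso_week_start(write_time))
```

Under that assumption, 7-day windows begin on Thursdays (1970-01-01 was a Thursday) while ISO weeks begin on Mondays, so a (year, week) partition would typically straddle two TWCS windows rather than line up with one.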


I'd appreciate any other thoughts on compaction and/or twcs.


