cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: size tiered compaction - improvement
Date Tue, 03 Apr 2012 22:53:34 GMT
Twitter tried a timestamp-based compaction strategy in
https://issues.apache.org/jira/browse/CASSANDRA-2735.  The conclusion
was, "this actually resulted in a lot more compactions than the
SizeTieredCompactionStrategy. The increase in IO was not acceptable
for our use and therefore stopped working on this patch."

2012/4/3 Radim Kolar <hsn@filez.com>:
> There is a problem with the size-tiered compaction design. It compacts
> together tables of similar size.
>
> Sometimes you end up with sstables sitting on disk forever (see the Feb 23
> files below) because no other similarly sized tables were created, and
> probably never will be, because a flushed sstable is about 11-16 MB.
>
> The next level is about 90 MB,
> then 5x 90 MB get compacted into a 400 MB sstable,
> and 5x 400 MB ~ 2 GB.
>
> The problem is that a 400 MB sstable is too small to be compacted against
> these 3x 720 MB ones.
>
> -rw-r--r--  1 root  wheel   165M Feb 23 17:03 resultcache-hc-13086-Data.db
> -rw-r--r--  1 root  wheel   772M Feb 23 17:04 resultcache-hc-13087-Data.db
> -rw-r--r--  1 root  wheel   156M Feb 23 17:06 resultcache-hc-13091-Data.db
> -rw-r--r--  1 root  wheel   716M Feb 23 17:18 resultcache-hc-13096-Data.db
> -rw-r--r--  1 root  wheel   734M Feb 23 17:29 resultcache-hc-13101-Data.db
> -rw-r--r--  1 root  wheel   5.0G Mar 14 09:38 resultcache-hc-13923-Data.db
> -rw-r--r--  1 root  wheel   1.9G Mar 16 22:41 resultcache-hc-14084-Data.db
> -rw-r--r--  1 root  wheel   1.9G Mar 21 15:11 resultcache-hc-14460-Data.db
> -rw-r--r--  1 root  wheel   1.9G Mar 27 05:22 resultcache-hc-14694-Data.db
> -rw-r--r--  1 root  wheel   2.0G Mar 31 04:57 resultcache-hc-14851-Data.db
> -rw-r--r--  1 root  wheel   112M Mar 31 06:30 resultcache-hc-14922-Data.db
> -rw-r--r--  1 root  wheel   577M Apr  1 19:25 resultcache-hc-14943-Data.db
>
> The compaction strategy needs to compact sstables by timestamp too: older
> tables should have an increased chance of being compacted.
> For example, a table from today would be compacted with other tables in the
> range (0.5-1.5) of its size, and this range would widen with sstable age.
> A 1-month-old table would have a range of, for example, (0.2-1.8).
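
The age-widened range proposed in the quoted message could be sketched roughly
as follows. This is only an illustration of the idea, not Cassandra code and
not the CASSANDRA-2735 patch: the names SSTable, size_range and candidates,
the linear widening, and the 30-day cap are all assumptions made for the
example. Only the two endpoints, (0.5-1.5) for a new table and (0.2-1.8) for a
1-month-old one, come from the message.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    # Hypothetical record for illustration; not a Cassandra class.
    name: str
    size_mb: float
    age_days: float


def size_range(age_days, full_age_days=30.0):
    """Return (low, high) size multipliers for a table of the given age.

    Assumption: the range widens linearly from (0.5, 1.5) at age 0
    to (0.2, 1.8) at full_age_days, and stays there afterwards.
    """
    frac = min(max(age_days, 0.0) / full_age_days, 1.0)
    widen = 0.3 * frac
    return 0.5 - widen, 1.5 + widen


def candidates(target, others):
    """List the tables whose size falls inside target's age-dependent range."""
    low, high = size_range(target.age_days)
    return [
        t for t in others
        if low * target.size_mb <= t.size_mb <= high * target.size_mb
    ]


if __name__ == "__main__":
    bigger = [SSTable("a", 700, 40), SSTable("b", 710, 40)]
    fresh = SSTable("new", 400, 0)
    old = SSTable("old", 400, 30)
    print([t.name for t in candidates(fresh, bigger)])
    print([t.name for t in candidates(old, bigger)])
```

With these assumed numbers, a fresh 400 MB table is not matched with tables of
around 700 MB, while a 30-day-old 400 MB table is. The wider range is also
what makes more sstables eligible for compaction, which is consistent with the
extra IO reported in the JIRA ticket.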



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
