cassandra-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Cassandra Wiki] Update of "MemtableSSTable" by JonathanEllis
Date Mon, 13 Sep 2010 14:08:45 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "MemtableSSTable" page has been changed by JonathanEllis.
http://wiki.apache.org/cassandra/MemtableSSTable?action=diff&rev1=14&rev2=15

--------------------------------------------------

  == Compaction ==
  To bound the number of SSTable files that must be consulted on reads, and to reclaim [[DistributedDeletes|space
taken by unused data]], Cassandra performs compactions: merging multiple old SSTable files
into a single new one. Compactions are triggered when at least N SSTables have been flushed
to disk, where N is tunable and defaults to 4. Four similar-sized SSTables are merged into
a single one. They start out being the same size as your memtable flush size, and then form
a hierarchy with each one doubling in size. So you'll have up to N of the same size as your
memtable, then up to N double that size, then up to N double that size, etc.
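
The tiering described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Cassandra's implementation: the names `MIN_COMPACTION_THRESHOLD`, `bucket_by_size`, `compact_once` and `flush_and_compact` are made up for illustration, "similar size" is assumed to mean within 0.5x to 1.5x of a bucket's average, and the merged size is taken to be the sum of the inputs (an upper bound that holds when there are no overwrites or deletes).

```python
MIN_COMPACTION_THRESHOLD = 4  # "N" in the text; hypothetical name for this sketch


def bucket_by_size(sizes, low=0.5, high=1.5):
    """Group sizes into buckets of 'similar' size.

    Assumption: a size joins a bucket if it lies within [low, high]
    times that bucket's current average.
    """
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if low * avg <= size <= high * avg:
                bucket.append(size)
                break
        else:
            buckets.append([size])
    return buckets


def compact_once(sizes, threshold=MIN_COMPACTION_THRESHOLD):
    """Merge `threshold` similar-sized SSTables if any bucket is full."""
    for bucket in bucket_by_size(sizes):
        if len(bucket) >= threshold:
            merged = bucket[:threshold]
            remaining = list(sizes)
            for s in merged:
                remaining.remove(s)
            # Upper bound: with no overwrites or deletes nothing is discarded.
            remaining.append(sum(merged))
            return sorted(remaining)
    return sorted(sizes)


def flush_and_compact(sizes, flush_size):
    """Add one freshly flushed SSTable, then compact until nothing changes."""
    sizes = sorted(sizes + [flush_size])
    while True:
        compacted = compact_once(sizes)
        if compacted == sizes:
            return sizes
        sizes = compacted


sstables = []
for _ in range(5):
    sstables = flush_and_compact(sstables, 64)
print(sstables)  # one merged SSTable plus one fresh flush
```

Under these assumptions the number of files on disk stays small even after many flushes, which is the point of bounding how many SSTables a read must consult.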
  
- "Minor" compactions only merge sstables of similar size; "major" compactions merge all sstables
in a given !ColumnFamily.  Only major compactions can clean out obsolete [[DistributedDeletes|tombstones]].
+ "Minor" compactions only merge sstables of similar size; "major" compactions merge all sstables
in a given !ColumnFamily.  Prior to Cassandra 0.6.6/0.7.0, only major compactions could clean
out obsolete [[DistributedDeletes|tombstones]].
  
  Since the input SSTables are all sorted by key, merging can be done efficiently, still requiring
no random i/o.  Once compaction is finished, the old SSTable files may be deleted: note that
in the worst case (a workload consisting of no overwrites or deletes) this will temporarily
require 2x your existing on-disk space used.  In today's world of multi-TB disks this is usually
not a problem but it is good to keep in mind when you are setting alert thresholds.
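
Why sorted inputs make the merge cheap can be shown with a streaming k-way merge. This is a minimal sketch, not Cassandra's code: each SSTable is modelled as a list of `(key, timestamp, value)` tuples already sorted by key, the function name `merge_sstables` is hypothetical, and the rule "the version with the highest timestamp wins" is an assumption made for illustration. Tombstone handling is left out.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_sstables(*sstables):
    """Merge key-sorted SSTables in one sequential pass.

    Each input is read front to back exactly once, so no random
    access into any input is needed.
    """
    # heapq.merge lazily interleaves already-sorted inputs by key.
    merged = heapq.merge(*sstables, key=itemgetter(0))
    for _key, versions in groupby(merged, key=itemgetter(0)):
        # Assumption for this sketch: keep only the newest version of a key.
        yield max(versions, key=itemgetter(1))


older = [("a", 1, "x"), ("c", 1, "y")]
newer = [("a", 2, "z"), ("b", 2, "w")]
for row in merge_sstables(older, newer):
    print(row)
```

The output can never hold more rows than the inputs combined, and it equals their combined size only when no key is overwritten. Since the inputs must remain on disk until the merge completes, that case is where the temporary 2x space requirement comes from.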
  
