cassandra-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Cassandra Wiki] Trivial Update of "MemtableSSTable" by JonathanEllis
Date Thu, 22 Apr 2010 14:56:28 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "MemtableSSTable" page has been changed by JonathanEllis.
http://wiki.apache.org/cassandra/MemtableSSTable?action=diff&rev1=8&rev2=9

--------------------------------------------------

  Cassandra writes go first to the [[Durability|CommitLog]], and then to a per-!ColumnFamily
structure called a Memtable.  A Memtable is basically a write-back cache of data rows that
can be looked up by key: unlike a write-through cache, writes are batched up in
the Memtable until it is full, and only then written to disk as an SSTable.
  
- The process of turning a Memtable into an SSTable is called flushing.  You can manually trigger
a flush via JMX (e.g. with bin/nodetool), which you may want to do before restarting nodes since
it will reduce !CommitLog replay time.  Memtables are sorted by key and then written out sequentially.
+ The process of turning a Memtable into an SSTable is called flushing.  You can manually trigger
a flush via JMX (e.g. with bin/nodetool), which you may want to do before restarting nodes since
it will reduce !CommitLog replay time.  Memtables are sorted by key and then written out sequentially.
 Thus, writes are extremely fast, costing only a commitlog append and an amortized sequential
write for the flush!
- 
- Thus, writes are extremely fast, costing only a commitlog append and an amortized sequential
write for the flush!
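
The write path described above can be sketched roughly as follows. This is a toy model for illustration only, not Cassandra's implementation: the `Memtable` class, the `max_rows` threshold, and the use of Python lists in place of on-disk files are all assumptions made for the example.

```python
class Memtable:
    """Toy write-back buffer: absorbs writes in memory, flushes them sorted."""

    def __init__(self, commit_log, max_rows=3):
        self.commit_log = commit_log  # a list standing in for an append-only file
        self.max_rows = max_rows      # hypothetical "full" threshold
        self.rows = {}                # key -> latest value
        self.sstables = []            # each flush appends one immutable tuple

    def write(self, key, value):
        self.commit_log.append((key, value))  # durability first: log append
        self.rows[key] = value                # then the in-memory update
        if len(self.rows) >= self.max_rows:
            self.flush()

    def flush(self):
        if not self.rows:
            return
        # Sort by key, then write everything out in one sequential pass.
        self.sstables.append(tuple(sorted(self.rows.items())))
        self.rows = {}


log = []
table = Memtable(log)
for key, value in [("c", 1), ("a", 2), ("c", 3), ("b", 4)]:
    table.write(key, value)
```

In this run the overwrite of key "c" is absorbed in memory, so four logged writes produce a single flushed table of three key-sorted rows.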
  
  Once flushed, SSTable files are immutable; no further writes may be done.  So, on the read
path, the server may have to combine row fragments from all the SSTables on disk, as well as
any unflushed Memtables, to produce the requested data.  Tricks like bloom filters let it
skip SSTables that cannot contain the requested key.
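
A minimal sketch of that read-side merge, under stated assumptions: each SSTable and the Memtable are modeled as plain dicts mapping a row key to `{column: (timestamp, value)}`, the newest timestamp wins per column, and bloom filters are left out. The `read_row` function and the sample data are hypothetical.

```python
def read_row(key, sstables, memtable):
    """Merge column fragments for one row; the newest timestamp wins per column."""
    merged = {}  # column name -> (timestamp, value)
    for source in list(sstables) + [memtable]:
        fragment = source.get(key, {})
        for column, (timestamp, value) in fragment.items():
            if column not in merged or timestamp > merged[column][0]:
                merged[column] = (timestamp, value)
    return {column: value for column, (timestamp, value) in merged.items()}


older = {"user1": {"name": (1, "Ann"), "city": (1, "Oslo")}}
newer = {"user1": {"city": (5, "Rome")}}
unflushed = {"user1": {"email": (9, "ann@example.org")}}

row = read_row("user1", [older, newer], unflushed)
```

Here the row is assembled from three sources: "name" comes from the older SSTable, "city" from the newer one, and "email" from the unflushed Memtable.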
  
@@ -12, +10 @@

  
  Since the input SSTables are all sorted by key, merging can be done efficiently, still requiring
no random i/o.  Once compaction is finished, the old SSTable files may be deleted.  Note that
in the worst case (a workload consisting of no overwrites or deletes) compaction temporarily
requires up to twice the on-disk space you currently use.  In today's world of multi-TB disks
this is usually not a problem, but it is good to keep in mind when you are setting alert thresholds.
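
The merge step can be illustrated with a streaming k-way merge. This is a sketch under assumptions, not the actual compaction code: each SSTable is modeled as a key-sorted list of `(key, timestamp, value)` tuples with at most one entry per key, and the `compact` function is hypothetical.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def compact(sstables):
    """Merge key-sorted SSTables into one, keeping the newest entry per key."""
    # heapq.merge streams the already-sorted inputs in key order,
    # so every input is read sequentially from start to end.
    merged = heapq.merge(*sstables, key=itemgetter(0))
    output = []
    for _key, entries in groupby(merged, key=itemgetter(0)):
        # Each entry is (key, timestamp, value); keep the latest write.
        output.append(max(entries, key=itemgetter(1)))
    return output


overlapping = compact([[("a", 1, "x"), ("c", 1, "y")],
                       [("a", 2, "z"), ("b", 2, "w")]])
disjoint = compact([[("a", 1, "x"), ("c", 1, "y")],
                    [("b", 2, "w"), ("d", 2, "v")]])
```

With an overwritten key the output is smaller than the combined inputs. With no overwrites, as in `disjoint`, the output holds every input entry, which is why the old and new files together can briefly occupy twice the original space.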
  
- SSTables that are obsoleted by a compaction are deleted asynchronously when the JVM performs
a GC.  You can force a GC from jconsole if necessary but this is not necessary; Cassandra
will force one itself if it detects that it is low on space.  A compaction marker is also
added to obsolete sstables so they can be deleted on startup if the server does not perform
a GC before being restarted.
+ SSTables that are obsoleted by a compaction are deleted asynchronously when the JVM performs
a GC.  You can force a GC from jconsole if necessary, but Cassandra will force one itself
if it detects that it is low on space.  A compaction marker is also added to obsolete sstables
so they can be deleted on startup if the server does not perform a GC before being restarted.
  
  CFStoreMBean exposes sstable space used as getLiveDiskSpaceUsed (only includes size of non-obsolete
files) and getTotalDiskSpaceUsed (includes everything).
  
