cassandra-commits mailing list archives

Subject [10/34] cassandra git commit: Docs for Memtable and SSTable architecture
Date Mon, 27 Jun 2016 18:34:05 GMT
Docs for Memtable and SSTable architecture


Branch: refs/heads/trunk
Commit: 7bf837cae0a24f72428becb739102521042ed0ad
Parents: 8d2bd0d
Author: Tyler Hobbs <>
Authored: Fri Jun 17 15:10:09 2016 -0500
Committer: Sylvain Lebresne <>
Committed: Tue Jun 21 14:12:59 2016 +0200

 doc/source/architecture.rst | 53 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 51 insertions(+), 2 deletions(-)
diff --git a/doc/source/architecture.rst b/doc/source/architecture.rst
index 3f8a8ca..cb52477 100644
--- a/doc/source/architecture.rst
+++ b/doc/source/architecture.rst
@@ -145,20 +145,69 @@ throughput, latency, and availability.
 Storage Engine
+.. _commit-log:
 .. todo:: todo
+.. _memtables:
-.. todo:: todo
+Memtables are in-memory structures where Cassandra buffers writes.  In general, there is one active memtable per table.
+Eventually, memtables are flushed onto disk and become immutable `SSTables`_.  This can be triggered in several ways:
+- The memory usage of the memtables exceeds the configured threshold (see ``memtable_cleanup_threshold``)
+- The :ref:`commit-log` approaches its maximum size, and forces memtable flushes in order to allow commitlog segments to
+  be freed
+Memtables may be stored entirely on-heap or partially off-heap, depending on ``memtable_allocation_type``.
-.. todo:: todo
+SSTables are the immutable data files that Cassandra uses for persisting data on disk.
+As SSTables are flushed to disk from :ref:`memtables` or are streamed from other nodes, Cassandra triggers compactions
+which combine multiple SSTables into one.  Once the new SSTable has been written, the old SSTables can be removed.
+Each SSTable consists of multiple components stored in separate files:
+``Data.db``
+  The actual data, i.e. the contents of rows.
+``Index.db``
+  An index from partition keys to positions in the ``Data.db`` file.  For wide partitions, this may also include an
+  index to rows within a partition.
+``Summary.db``
+  A sampling of (by default) every 128th entry in the ``Index.db`` file.
+``Filter.db``
+  A Bloom Filter of the partition keys in the SSTable.
+``CompressionInfo.db``
+  Metadata about the offsets and lengths of compression chunks in the ``Data.db`` file.
+``Statistics.db``
+  Stores metadata about the SSTable, including information about timestamps, tombstones, clustering keys, compaction,
+  repair, compression, TTLs, and more.
+``Digest.crc32``
+  A CRC-32 digest of the ``Data.db`` file.
+``TOC.txt``
+  A plain text list of the component files for the SSTable.
+Within the ``Data.db`` file, rows are organized by partition.  These partitions are sorted in token order (i.e. by a
+hash of the partition key when the default partitioner, ``Murmur3Partitioner``, is used).  Within a partition, rows are
+stored in the order of their clustering keys.
+SSTables can be optionally compressed using block-based compression.
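
The on-disk ordering described in the patch (partitions in token order, rows within a partition in clustering-key order) can be illustrated with a small Python sketch. This is a toy model, not Cassandra code: ``toy_token`` and ``sstable_order`` are hypothetical names, and the hash is an MD5-based stand-in rather than Murmur3.

```python
import hashlib


def toy_token(partition_key: str) -> int:
    """Map a partition key to a signed 64-bit token.

    Assumption: MD5 is used here only as a convenient stand-in hash;
    Cassandra's default partitioner uses Murmur3 instead.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def sstable_order(rows):
    """Sort (partition_key, clustering_key, value) tuples the way the
    docs describe the Data.db layout: by token, then by clustering key.

    The partition key is included as a tie-breaker so that two keys
    with the same token still end up in separate, contiguous groups.
    """
    return sorted(rows, key=lambda r: (toy_token(r[0]), r[0], r[1]))


rows = [("b", 2, "x"), ("a", 1, "y"), ("b", 1, "z"), ("a", 3, "w")]
for row in sstable_order(rows):
    print(row)
```

Because the sort key starts with the token, all rows of a partition are contiguous in the output, which is what lets a reader seek to one partition and scan its rows in clustering order.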

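The relationship between the partition index and its sampled summary (every 128th index entry by default, per the patch) can be sketched in the same way. Again this is only an illustration under simplifying assumptions: ``build_summary`` and ``lookup`` are hypothetical helpers, tokens are plain integers, and the index is an in-memory list of ``(token, data_position)`` pairs rather than a file.

```python
from bisect import bisect_right

SAMPLING_INTERVAL = 128  # default sampling rate mentioned in the docs


def build_summary(index_entries, interval=SAMPLING_INTERVAL):
    """Keep every `interval`-th entry of a token-sorted index.

    Each summary entry is (token, offset_into_index).
    """
    return [
        (token, i)
        for i, (token, _pos) in enumerate(index_entries)
        if i % interval == 0
    ]


def lookup(index_entries, summary, token, interval=SAMPLING_INTERVAL):
    """Return the data position for `token`, or None if it is absent."""
    sampled_tokens = [t for t, _ in summary]
    # Find the last sampled token that is <= the requested token.
    slot = bisect_right(sampled_tokens, token) - 1
    if slot < 0:
        return None  # token sorts before the first index entry
    start = summary[slot][1]
    # Scan at most one sampling interval of the full index.
    for tok, pos in index_entries[start:start + interval]:
        if tok == token:
            return pos
    return None


# Toy index: 1000 even tokens, each pointing at a made-up data position.
index = [(t * 2, t * 100) for t in range(1000)]
summary = build_summary(index)
print(len(summary), lookup(index, summary, 600))
```

The summary is much smaller than the index, so it can be searched cheaply before touching the full index; the trade-off is that each lookup then scans up to one sampling interval of index entries.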