cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Apache Wiki <>
Subject [Cassandra Wiki] Update of "FileFormatDesignDoc" by StuHood
Date Tue, 18 Jan 2011 08:29:05 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "FileFormatDesignDoc" page has been changed by StuHood.


  Cassandra also needs to encode metadata about tuples and ranges of tuples in order to represent
creation and deletion timestamps. For both value tuples and range tuples, a varying number
(depending on value and range type) of timestamps will need to be encoded.
+ === Types ===
+  * ''deleted'' - If a range/value has been deleted at any point in time, we need to record
that action with timestamps: marking a range/value ''deleted'' indicates that these timestamps
+  * ''standard'' - A standard value has a timestamp to indicate when it was created; a range
will be marked standard if it has children (a standard range without children has no reason
to exist)
+  * ''expiring'' - An expiring value has additional timestamps to indicate when it should
be garbage collected. Currently, only values may be marked expiring.
+  * ''null'' - A range or value entry in a chunk that is only a placeholder. For example,
if a parent has ranges but no values, a ''null'' value will be inserted in a chunk to indicate
to viewers that the parent exists.
  === Range Metadata ===
  Range tuples can be encoded in a very similar fashion to the value tuples represented above,
except that they always come in pairs. It will likely make sense to store them in a separate
blob from the value tuples, since they will bear very little similarity to one another (TODO:
need to confirm with an anecdote or two).
@@ -203, +210 @@

  This example shows a range tombstone for values at level "name1" between 'havarti' and 'muenster':
the chunk for the "name1" level stores a pair of range tuples for the 'cheese' parent and
nulls are stored for parents without any range metadata. The end result is that the span stores
a tombstone from ('cheese', 'havarti', <empty>) to ('cheese', 'muenster', null), where
<empty> is the smallest value, and null is the largest value.
  Note that it is not possible for ranges for a parent to overlap: in this case, the ranges
would be resolved such that the intersection was given the winning timestamp, and the two
remainders would use their original timestamps.
+ ==== Other uses ====
+ In cases where parents are too wide to fit in a single span, range metadata should also
be used to indicate how much of a particular parent is present in a given span.
  ==== Effect of ordering ====

View raw message