From Dave Brosius <>
Subject Re: SSTable format
Date Sat, 14 Jul 2012 00:18:23 GMT
On 07/13/2012 08:00 PM, Michael Theroux wrote:
> Hello,
> I've been trying to understand in greater detail how SStables are stored, and how information
> is transferred between Cassandra nodes, especially when a new node is joining a cluster.
> Specifically, is information stored to SStables ordered by rowkeys?  Some of the articles
> I've read suggest this is the case (although it's a little vague whether they actually mean that
> the columns are stored in order, not the rowkeys).  However, if data is stored in rowkey order,
> how is this achieved, as sstables are immutable?
> Thanks for any insights,
> -Mike

It depends on which partitioner you use. You should be using the
RandomPartitioner, and if so, the rows are sorted by the MD5 hash of the
row key. There are partitioners that sort based on the raw key value, but
these partitioners shouldn't be used, as they have problems due to uneven
partitioning of data.
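
To make the difference in ordering concrete, here is a rough Python sketch
(not Cassandra's actual code). The helper names token() and
sort_rows_by_token() are made up for illustration, and reading the MD5
digest as an unsigned integer is a simplification of how the
RandomPartitioner derives its token.

```python
import hashlib

def token(row_key: bytes) -> int:
    """Hypothetical helper: the MD5 digest of the key, read as an unsigned integer."""
    return int.from_bytes(hashlib.md5(row_key).digest(), "big")

def sort_rows_by_token(row_keys):
    """Order keys by their hash token rather than by their raw value."""
    return sorted(row_keys, key=token)

keys = [b"alice", b"bob", b"carol", b"dave"]
by_token = sort_rows_by_token(keys)   # order a hash-based partitioner would use
by_raw = sorted(keys)                 # order a raw-key partitioner would use
```

The two orderings generally differ: hashing scatters neighbouring keys
across the token range, which is what evens out the load.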

As for how this is done, remember an sstable doesn't hold all the data
for a column family. Not only does the data for a column family exist on
multiple servers, there are usually multiple sstable files on disk that
represent data from one column family on one machine. So at the time the
sstable is written, the rows that are to be put in the sstable are
sorted, and written in sorted order. The file is never modified after
that, so immutability is not an obstacle. In fact the same rowkey may be
written in multiple sstables, one sstable having one set of columns for
the key, the other sstable having other columns for the same key.
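
A small Python sketch of that write path, under some loud assumptions: a
plain dict stands in for the in-memory buffer of pending rows (Cassandra
calls it a memtable), a nested tuple stands in for the on-disk file, and
plain string order stands in for the partitioner's ordering. flush() is a
made-up name, not a Cassandra API.

```python
def flush(memtable):
    """Write a mutable buffer of rows out as an immutable, sorted run.

    Sorting happens once, at write time; the returned tuple is never
    modified afterwards.
    """
    return tuple(
        (row_key, tuple(sorted(columns.items())))
        for row_key, columns in sorted(memtable.items())
    )

# Two flushes at different times give two independent sstables.
sstable_1 = flush({"user2": {"name": "Bob"}, "user1": {"name": "Al"}})
sstable_2 = flush({"user1": {"email": "al@example.com"}})
```

Note that "user1" ends up in both sstables, each holding a different
subset of its columns.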

When a row is queried by key, Cassandra is responsible for working out
which sstables (potentially several) hold columns for that key and
merging the results.
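
Roughly, the merge can be pictured like this. It is a sketch only:
read_row() is a made-up name, each sstable is modelled as a dict of
row key -> {column name: (value, timestamp)}, and keeping the column
version with the newest timestamp is the reconciliation rule assumed here.

```python
def read_row(row_key, sstables):
    """Merge the columns stored for row_key across every sstable that has it."""
    merged = {}
    for sstable in sstables:
        for name, (value, timestamp) in sstable.get(row_key, {}).items():
            # Keep the most recently written version of each column.
            if name not in merged or timestamp > merged[name][1]:
                merged[name] = (value, timestamp)
    return {name: value for name, (value, _) in merged.items()}

older = {"user1": {"name": ("Al", 1), "city": ("Boston", 1)}}
newer = {"user1": {"city": ("Austin", 2), "email": ("al@example.com", 2)}}
row = read_row("user1", [older, newer])
```

Here the row comes back with "name" from the older sstable, "email" from
the newer one, and the newer value of "city".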

