incubator-cassandra-user mailing list archives

From Peter Schuller <peter.schul...@infidyne.com>
Subject Re: complexity
Date Fri, 24 Dec 2010 11:02:22 GMT
>> When a row is stored on disk in an SSTable, the cost of getting the row
>> is constant, because the in-memory indices always say where to find it.
>
> BTW: the indices are not kept in memory in full, only a sample of them is.
> This is controlled by "IndexInterval".  That is, 1/IndexInterval of the
> index entries are kept in memory. The default value is 128.

And the reason the cost is constant is that there is (logically) a
single seek into the index, followed by a single seek into the data.

The index sampling controls the granularity of the seek in the index.
With the default value of 128, Cassandra may need to read and
deserialize up to 128 index entries before finding the one being
looked for, but that data is read sequentially from disk.

(Whether or not this translates into exactly one seek in reality
depends on several factors, such as read-ahead logic. But the cost
stays constant with respect to data size, at the price of some memory
for the index sample.)
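
To make the sampling concrete, here is a rough sketch in Python. It is not Cassandra's actual code: the names build_sample and lookup are made up for illustration, the on-disk index is modelled as a sorted list of (key, data_offset) pairs, and keys are compared directly rather than by token. The point is only that the scan is bounded by the sampling interval, not by the size of the index.

```python
import bisect

INDEX_INTERVAL = 128  # default sampling interval mentioned above


def build_sample(index_entries, interval=INDEX_INTERVAL):
    """Keep every interval-th key, plus its position in the full index.

    index_entries stands in for the on-disk index: a list of
    (key, data_offset) pairs sorted by key (an assumption of this sketch).
    """
    keys, positions = [], []
    for pos in range(0, len(index_entries), interval):
        keys.append(index_entries[pos][0])
        positions.append(pos)
    return keys, positions


def lookup(key, index_entries, sample, interval=INDEX_INTERVAL):
    """Return (data_offset, entries_scanned); data_offset is None on a miss."""
    sample_keys, sample_positions = sample
    # In-memory step: find the last sampled key that is <= the wanted key.
    slot = bisect.bisect_right(sample_keys, key) - 1
    if slot < 0:
        return None, 0  # key sorts before the first index entry
    start = sample_positions[slot]
    # "Disk" step: one logical seek to start, then a sequential scan of
    # at most `interval` entries.
    scanned = 0
    for entry_key, data_offset in index_entries[start:start + interval]:
        scanned += 1
        if entry_key == key:
            return data_offset, scanned  # a second seek would go here
        if entry_key > key:
            break  # passed the slot where the key would be
    return None, scanned
```

With 1000 index entries and the default interval, only 8 keys are held in the sample, and no lookup scans more than 128 entries regardless of how large the index grows. A smaller interval shortens the scan but costs more memory.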

-- 
/ Peter Schuller
