incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Zhu Han <schumi....@gmail.com>
Subject Re: complexity
Date Fri, 24 Dec 2010 10:42:13 GMT
When the row is stored on disk as SSTable, the complexity of getting a row
is constant, as it always know where to get the row by in-memory indices.

When the row is stored in memory as memtable,  it is stored as skip list[1].
The complexity is O(logN).  N is the total number of rows in the skip list.

To get a column inside a row, the complexity is similar. When there are many
columns in a single row, cassandra will build indices for these columns when
write the row to SStable.  This controlled by "ColumnIndexSizeInKB".

If I made any mistake, please correct me. Thank you!

[1] http://en.wikipedia.org/wiki/Skip_list

best regards,
hanzhu


On Fri, Dec 24, 2010 at 12:20 PM, Kevin Irwig <kevinirwig@yahoo.com> wrote:

> Hi,
>
> I've seen a similar question has been asked in this forum in Sept, but not
> answered.
>
> What is the complexity of get(row) and get(row, column-name) operations,
> and
> insert(row, column)? What about accessing or inserting a column within a
> SuperColumn by name?
>
> In Arin Sarkissian's "WTF is a Supercolumn" he stresses that columns are
> stored
> sorted by name. Are they simply stored in a sorted list, requiring a search
> when
> inserting and accessing (I forget the worst case complexity of
> searching/inserting into/deleting from a sorted list), or in some data
> structure
> with faster access? Even if you don't know the big-O complexity, a
> description
> of the implementation data structure(s), and discussion of what is fast and
> what
> is not, would help.
>
> Thanks for your help,
> Kevin.
>
>
>
>
>

Mime
View raw message