When the row is stored on disk as SSTable, the complexity of getting a row
is constant, as it always know where to get the row by inmemory indices.
When the row is stored in memory as memtable, it is stored as skip list[1].
The complexity is O(logN). N is the total number of rows in the skip list.
To get a column inside a row, the complexity is similar. When there are many
columns in a single row, cassandra will build indices for these columns when
write the row to SStable. This controlled by "ColumnIndexSizeInKB".
If I made any mistake, please correct me. Thank you!
[1] http://en.wikipedia.org/wiki/Skip_list
best regards,
hanzhu
On Fri, Dec 24, 2010 at 12:20 PM, Kevin Irwig <kevinirwig@yahoo.com> wrote:
> Hi,
>
> I've seen a similar question has been asked in this forum in Sept, but not
> answered.
>
> What is the complexity of get(row) and get(row, columnname) operations,
> and
> insert(row, column)? What about accessing or inserting a column within a
> SuperColumn by name?
>
> In Arin Sarkissian's "WTF is a Supercolumn" he stresses that columns are
> stored
> sorted by name. Are they simply stored in a sorted list, requiring a search
> when
> inserting and accessing (I forget the worst case complexity of
> searching/inserting into/deleting from a sorted list), or in some data
> structure
> with faster access? Even if you don't know the bigO complexity, a
> description
> of the implementation data structure(s), and discussion of what is fast and
> what
> is not, would help.
>
> Thanks for your help,
> Kevin.
>
>
>
>
>
