cassandra-commits mailing list archives

From "Jun Rao (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-172) A improved and more general version of get_slice
Date Mon, 18 May 2009 18:45:45 GMT


Jun Rao commented on CASSANDRA-172:

3. Within an SSTable, all columns for a given (key, CF) pair are stored contiguously in a
segment. The column index is built roughly as follows: every time the size of the accumulated
columns reaches a threshold, a column index entry is created, holding a column name and
an offset within the segment. I call this a (column) block index. In the code, instead of
reading all columns from an SSTable at once, I use the column index to read the columns one
block at a time. This is more efficient when the number of columns in an SSTable is large (that's
the key difference between this api and get_slice). Depending on how many columns this api
needs to return, multiple column blocks may have to be fetched from a single SSTable. Does that
explain things?
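
To make the block index idea concrete, here is a minimal sketch in Python. It is not the
code in the attached patches. The function names, the length-prefixed column encoding, and
the choice to store the first column name of each block in the index entry are all
assumptions made for illustration.

```python
import struct
from bisect import bisect_right


def encode_column(name, value):
    # Hypothetical on-disk layout: 2-byte name length, name, 4-byte value length, value.
    raw = name.encode()
    return struct.pack(">H", len(raw)) + raw + struct.pack(">I", len(value)) + value


def decode_block(buf):
    # Decode every column contained in one block of bytes.
    pos = 0
    while pos < len(buf):
        (nlen,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        name = buf[pos:pos + nlen].decode()
        pos += nlen
        (vlen,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        yield name, buf[pos:pos + vlen]
        pos += vlen


def build_block_index(columns, threshold):
    # columns: (name, value) pairs sorted by name, all for one (key, CF) pair.
    # A new index entry (first column name, offset in segment) is started
    # whenever the bytes accumulated in the current block reach the threshold.
    segment = bytearray()
    index = []
    block_size = 0
    for name, value in columns:
        if not index or block_size >= threshold:
            index.append((name, len(segment)))
            block_size = 0
        encoded = encode_column(name, value)
        segment += encoded
        block_size += len(encoded)
    return bytes(segment), index


def get_slice_from(segment, index, start, count):
    # Return up to `count` columns with name >= start, reading one block at a
    # time, together with the number of blocks that had to be fetched.
    names = [name for name, _ in index]
    block = max(0, bisect_right(names, start) - 1)
    result = []
    blocks_read = 0
    while block < len(index) and len(result) < count:
        begin = index[block][1]
        end = index[block + 1][1] if block + 1 < len(index) else len(segment)
        blocks_read += 1
        for name, value in decode_block(segment[begin:end]):
            if name >= start and len(result) < count:
                result.append((name, value))
        block += 1
    return result, blocks_read
```

With this layout, a request for a few columns starting in the middle of a wide row only
touches the blocks that overlap the requested range, rather than the whole segment.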

> A improved and more general version of get_slice
> ------------------------------------------------
>                 Key: CASSANDRA-172
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jun Rao
>            Assignee: Jun Rao
>             Fix For: 0.4
>         Attachments: get_slice_from.patchv1, get_slice_from.patchv2
> Today, get_slice has to scan through all columns in every memtable and sstable to get
> a slice of columns. This becomes inefficient when the number of columns in a row is large.
> We need a more efficient API.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
