cassandra-commits mailing list archives

From "Daniel Doubleday (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-2498) Improve read performance in update-intensive workload
Date Wed, 13 Jul 2011 19:25:00 GMT


Daniel Doubleday updated CASSANDRA-2498:

    Attachment: supersede-name-filter-collations.patch

Took a shot at this one.

I saw 2 ways of doing this:

- Implementing lazy versions of column iterators
- Doing multiple collations, removing a column from the filter once one can be sure that the
column will supersede any other version

The problem with the second choice is that everything in the query filter code assumes that all
column iterators are collated at once. But since the first choice seemed to be a lot of effort,
I tried multiple collations anyway.

So here's the plan:

- collect and collate memtables
- while there are columns in the filter that are not known to supersede any other, iterate
over the sorted sstables and remove all columns that supersede
- stop if no columns are left in the filter

Everything takes place in a new CollationController.
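
The plan above can be sketched roughly as follows. This is only an illustration in Python, not the code in the attached patch: the function names, the dict-based memtable/sstable model and the (timestamp, value) column tuples are all assumptions made for the example, and tombstones and the grace-time edge case are ignored. It also assumes "sorted sstables" means sorted by the newest client timestamp each sstable contains, as the issue description proposes.

```python
# Hypothetical model: a memtable or sstable is a dict {column_name: (timestamp, value)}.

def max_timestamp(table):
    """Newest client timestamp stored in a table (assumed per-sstable metadata)."""
    return max((ts for ts, _ in table.values()), default=float("-inf"))


def collate_with_shrinking_filter(memtables, sstables, names):
    """Return ({name: (timestamp, value)}, number_of_sstables_read)."""
    result = {}

    def merge(source, wanted):
        # Keep the version with the highest timestamp for each wanted column.
        for name in wanted:
            col = source.get(name)
            if col is not None and (name not in result or col[0] > result[name][0]):
                result[name] = col

    # Step 1: collect and collate memtables for every requested column.
    for memtable in memtables:
        merge(memtable, names)

    # Step 2: visit sstables newest-max-timestamp first, shrinking the filter.
    remaining = set(names)
    sstables_read = 0
    for sstable in sorted(sstables, key=max_timestamp, reverse=True):
        bound = max_timestamp(sstable)
        # A column already found with a timestamp newer than anything in this
        # (and therefore every later) sstable supersedes; drop it from the filter.
        remaining = {n for n in remaining
                     if n not in result or result[n][0] <= bound}
        # Step 3: stop if no columns are left in the filter.
        if not remaining:
            break
        merge(sstable, remaining)
        sstables_read += 1

    return result, sstables_read
```

With one memtable holding a recent "a" and three sstables of which only the newest holds a recent "b", this sketch reads a single sstable and skips the other two.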

So far I found one ugly edge case that comes up with system tables that have 0 grace time.

If you guys think that's a worthwhile approach I'll provide tests. Standard test suite obviously

I guess it would be easy to extend that approach to slice filters and skinny rows when implementing
CASSANDRA-2503. The only thing that would be needed is a superseding timestamp in the header
of a row / CF, same as DeletionInfo, and probably a configuration option per CF. If too many
sstables are read, then one could compact that row and put it in the memtable.

Also, with a little (or more) work it might be interesting to see if superseding range information
could be stored at the block level (row index).

> Improve read performance in update-intensive workload
> -----------------------------------------------------
>                 Key: CASSANDRA-2498
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>              Labels: ponies
>             Fix For: 1.0
>         Attachments: supersede-name-filter-collations.patch
> Read performance in an update-heavy environment relies heavily on compaction to maintain
good throughput. (This is not the case for workloads where rows are only inserted once, because
the bloom filter keeps us from having to check sstables unnecessarily.)
> Very early versions of Cassandra attempted to mitigate this by checking sstables in descending
generation order (mostly equivalent to descending mtime): once all the requested columns were
found, it would not check any older sstables.
> This was incorrect, because data timestamp will not correspond to sstable timestamp,
both because compaction has the side effect of "refreshing" data to a newer sstable, and because
hinted handoff may send us data older than what we already have.
> Instead, we could create a per-sstable piece of metadata containing the most recent (client-specified)
timestamp for any column in the sstable.  We could then sort sstables by this timestamp instead,
and perform a similar optimization (if the remaining sstable client-timestamps are older than
the oldest column found in the desired result set so far, we don't need to look further).
Since under almost every workload, client timestamps of data in a given sstable will tend
to be similar, we expect this to cut the number of sstables down proportionally to how frequently
each column in the row is updated. (If each column is updated with each write, we only have
to check a single sstable.)
> This may also be useful information when deciding which SSTables to compact.
> (Note that this optimization is only appropriate for named-column queries, not slice
queries, since we don't know what non-overlapping columns may exist in older sstables.)
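
To make the argument in the quoted description concrete, here is a small sketch contrasting the early generation-order read with the proposed max-timestamp order. It is not Cassandra code: make_sstable, read_by_generation, read_by_max_timestamp and the dict layout are hypothetical names invented for this example, and each column is modelled as a (timestamp, value) tuple.

```python
def make_sstable(generation, columns):
    """Hypothetical sstable: columns is a non-empty {name: (timestamp, value)} dict."""
    return {
        "generation": generation,
        "columns": columns,
        # Proposed per-sstable metadata: newest client timestamp it contains.
        "max_timestamp": max(ts for ts, _ in columns.values()),
    }


def _merge(result, sstable, names):
    # Keep the version with the highest client timestamp for each requested name.
    for name in names:
        col = sstable["columns"].get(name)
        if col is not None and (name not in result or col[0] > result[name][0]):
            result[name] = col


def read_by_generation(sstables, names):
    """Early approach: newest generation first, stop once every name is found."""
    result = {}
    for sstable in sorted(sstables, key=lambda s: s["generation"], reverse=True):
        _merge(result, sstable, names)
        if all(n in result for n in names):
            break
    return result


def read_by_max_timestamp(sstables, names):
    """Proposed approach: newest max client timestamp first, with a safe stop."""
    result = {}
    for sstable in sorted(sstables, key=lambda s: s["max_timestamp"], reverse=True):
        found_all = all(n in result for n in names)
        # Stop only when everything left is older than the oldest column found.
        if found_all and sstable["max_timestamp"] < min(result[n][0] for n in names):
            break
        _merge(result, sstable, names)
    return result


# Generation 2 was written later but holds older data (e.g. via hinted handoff).
tables = [
    make_sstable(1, {"x": (200, "fresh")}),
    make_sstable(2, {"x": (100, "stale")}),
]
by_generation = read_by_generation(tables, ["x"])
by_timestamp = read_by_max_timestamp(tables, ["x"])
```

In this example the generation-order read stops after the generation 2 sstable and keeps the older value, while the max-timestamp order reads the generation 1 sstable first and can skip the other one.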

