cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-2897) Secondary indexes without read-before-write
Date Tue, 19 Jul 2011 06:37:58 GMT


Sylvain Lebresne commented on CASSANDRA-2897:

bq. If the index isn't up to date and index clauses are the primary way of pulling rows then
this data will never be repaired.

Well, right now I'm pretty sure we always query the actual rows (including at least the 'state'
column). Arguably, in our current scheme, this could be optimized out if we know that the
predicate is provably empty; this wouldn't be optimizable in the one proposed here. It's clearly
a trade-off. I do believe it would likely be a win overall (without having any certainty though),
but I'm biased in that I hate that synchronized read-before-write thing.  

> Secondary indexes without read-before-write
> -------------------------------------------
>                 Key: CASSANDRA-2897
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Sylvain Lebresne
>            Priority: Minor
>              Labels: secondary_index
> Currently, secondary index updates require a read-before-write to keep the index
consistent. Keeping the index consistent at all times is not necessary, however. We could let
the (secondary) index become inconsistent on writes and repair it on reads. This would be
easy because on reads we make sure to request the indexed columns anyway, so we can just
skip the rows that are not needed and repair the index at the same time.
> This does trade work on writes for work on reads. However, read-before-write is sufficiently
costly that it will likely be a win overall.
> There are (at least) two small technical difficulties here, though:
> # If we repair on read, this will be racy with writes, so we'll probably have to synchronize.
> # We probably shouldn't rely only on reads to repair, and we should also have a task to
repair the index for things that are rarely read. It's unclear how to make that low impact.
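
To make the proposal above concrete, here is a minimal single-threaded sketch in Python. It is a toy in-memory model, not Cassandra code: the `ToyTable` class, its `rows` and `index` attributes, and the `write` and `query` methods are all hypothetical names introduced for illustration. It shows only the core idea (writes never read the old value, reads filter out and repair stale index entries) and ignores the synchronization and background-repair difficulties listed above.

```python
from collections import defaultdict


class ToyTable:
    """Toy model of one table with one indexed column (illustrative only)."""

    def __init__(self):
        self.rows = {}                 # row key -> current indexed column value
        self.index = defaultdict(set)  # indexed value -> row keys (may be stale)

    def write(self, key, value):
        # No read-before-write: we do not look up the old value, so any
        # index entry pointing at the previous value is simply left behind.
        self.rows[key] = value
        self.index[value].add(key)

    def query(self, value):
        # Read path: fetch the indexed column for every candidate key,
        # skip rows that no longer match, and repair the index as we go.
        matches = []
        for key in list(self.index.get(value, ())):
            if self.rows.get(key) == value:
                matches.append(key)
            else:
                self.index[value].discard(key)  # remove the stale entry
        return sorted(matches)


table = ToyTable()
table.write("k1", "NY")
table.write("k1", "CA")   # leaves a stale ("NY" -> "k1") index entry
table.write("k2", "CA")
ny_rows = table.query("NY")  # stale entry is skipped and removed here
ca_rows = table.query("CA")
```

In this sketch the stale entry costs one extra row lookup on the first query that hits it, which is the work shifted from the write path to the read path. A real implementation would also have to deal with a write racing against the repair step.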
