cassandra-user mailing list archives

From Jim Ancona <>
Subject Re: Secondary Indexes, Quorum and Cluster Availability
Date Tue, 05 Jun 2012 20:30:16 GMT
On Mon, Jun 4, 2012 at 2:34 PM, aaron morton <> wrote:

> IIRC index slices work a little differently with consistency, they need to
> have CL level nodes available for all token ranges. If you drop it to CL
> ONE the read is local only for a particular token range.

Yes, this is what we observed. When I reasoned my way through what I knew
about how secondary indexes work, I came to the same conclusion about all
token ranges having to be available.
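
That reasoning can be sketched as a small model. It is only a rough illustration, not Cassandra code. It assumes SimpleStrategy-style placement (each range lives on its primary node plus the next RF-1 nodes on the ring), evenly sized ranges with uniformly distributed row keys, and an index read that fails whenever any single range cannot meet the consistency level. The function names are made up for this example.

```python
# Toy availability model; all names here are hypothetical, not Cassandra APIs.
NODES = ["A", "B", "C", "E", "F", "G"]  # ring order from the original message
RF = 3
QUORUM = RF // 2 + 1  # 2 replicas when RF = 3


def replicas(i, nodes=NODES, rf=RF):
    """Replica set for the range owned by nodes[i]: that node plus the
    next rf-1 nodes clockwise (assumed SimpleStrategy-style placement)."""
    return [nodes[(i + k) % len(nodes)] for k in range(rf)]


def unavailable_ranges(down, required, nodes=NODES, rf=RF):
    """Primary owners of the ranges with fewer than `required` live replicas."""
    return [
        nodes[i]
        for i in range(len(nodes))
        if sum(r not in down for r in replicas(i, nodes, rf)) < required
    ]


def failure_rates(down, required):
    """Return (row-key failure fraction, index failure fraction).

    Row-key reads fail only for keys in an unavailable range.
    Index reads are assumed to fail if any range is unavailable.
    """
    bad = unavailable_ranges(down, required)
    row_key = len(bad) / len(NODES)
    index = 1.0 if bad else 0.0
    return row_key, index


for down in [set(), {"A"}, {"A", "E"}, {"A", "B"}, {"A", "C"}]:
    row_key, index = failure_rates(down, QUORUM)
    print(sorted(down), f"row key: {row_key:.0%}", f"index: {index:.0%}")
```

Under these assumptions the model matches the scenarios in the original message quoted below: with A and B down, two of the six ranges cannot reach quorum (about 33% of row key reads fail); with A and C down, one range cannot (about 17%); and in both cases every index read fails. Passing `required=1` instead of `QUORUM` leaves no range unavailable with two nodes down, which is consistent with the suggestion to run the index query at CL ONE.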

My surprise at the behavior was because I *hadn't* reasoned my way through
it until we had the issue. Somehow I doubt I'm the only user of secondary
indexes that was unaware of this ramification of CL choice. It might be a
good idea for the documentation to reflect the tradeoffs more clearly.

Thanks for your help!


> The problem when doing index reads is the nodes that contain the results
> can no longer be selected by the partitioner.

> Cheers
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> On 2/06/2012, at 5:15 AM, Jim Ancona wrote:
> Hi,
> We have an application with two code paths, one of which uses a secondary
> index query and one of which doesn't. While testing node-down scenarios
> in our cluster we got a result which surprised (and concerned) me, and I
> wanted to find out if the behavior we observed is expected.
> Background:
>    - 6 nodes in the cluster (in order: A, B, C, E, F and G)
>    - RF = 3
>    - All operations at QUORUM
>    - Operation 1: Read by row key followed by write
>    - Operation 2: Read by secondary index, followed by write
> While running a mixed workload of operations 1 and 2, we got the following
> results:
>    Scenario             Result
>    -------------------  ----------------------------------------------
>    All nodes up         All operations succeed
>    One node down        All operations succeed
>    Nodes A and E down   All operations succeed
>    Nodes A and B down   Operation 1: ~33% fail; Operation 2: All fail
>    Nodes A and C down   Operation 1: ~17% fail; Operation 2: All fail
> We had expected (perhaps incorrectly) that the secondary index reads would
> fail in proportion to the portion of the ring that was unable to reach
> quorum, just as the row key reads did. For both operation types the
> underlying failure was an UnavailableException.
> The same pattern repeated for the other scenarios we tried. The row key
> operations failed at the expected ratios, given the portion of the ring
> that was unable to meet quorum because of nodes down, while all the
> secondary index reads failed as soon as 2 out of any 3 adjacent nodes were
> down.
> Is this an expected behavior? Is it documented anywhere? I didn't find it
> with a quick search.
> The operation doing the secondary index query is an important one for our app,
> and we'd really prefer that it degrade gracefully in the face of cluster
> failures. My plan at this point is to do that query at ConsistencyLevel.ONE
> (and accept the increased risk of inconsistency). Will that work?
> Thanks in advance,
> Jim
