From: aaron morton
To: user@cassandra.apache.org
Subject: Re: Secondary Indexes, Quorum and Cluster Availability
Date: Tue, 5 Jun 2012 06:34:17 +1200

IIRC index slices work a little differently with consistency: they need to have CL level nodes available for all token ranges. If you drop it to CL ONE the read is local only for a particular token range.
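
To put some numbers on that, here is a quick back-of-the-envelope sketch (assuming RF=3 with SimpleStrategy-style placement, i.e. each range is replicated on its owning node and the next two nodes in ring order; adjust if your placement differs). A row key read only needs quorum in its own replica set, so it fails for just the ranges that have lost 2 of their 3 replicas. An index read needs quorum in every replica set at once, so a single range below quorum makes the whole query unavailable:

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class QuorumMath {
    // Ring order from your message: A, B, C, E, F, G.
    static final String[] RING = {"A", "B", "C", "E", "F", "G"};

    // Replicas for the range owned by node i are nodes i, i+1 and i+2
    // (wrapping around the ring), i.e. SimpleStrategy-style placement.
    static int rangesWithoutQuorum(Set<String> down) {
        int lost = 0;
        for (int i = 0; i < RING.length; i++) {
            int live = 0;
            for (int r = 0; r < 3; r++) {
                if (!down.contains(RING[(i + r) % RING.length])) live++;
            }
            if (live < 2) lost++; // QUORUM with RF=3 needs 2 live replicas
        }
        return lost;
    }

    static void report(String n1, String n2) {
        int lost = rangesWithoutQuorum(new HashSet<String>(Arrays.asList(n1, n2)));
        System.out.println(n1 + "," + n2 + " down: " + lost + "/6 ranges below quorum");
    }

    public static void main(String[] args) {
        report("A", "E"); // 0/6 below quorum: everything succeeds
        report("A", "B"); // 2/6 (~33%) below quorum: ~33% of row key reads fail, all index reads fail
        report("A", "C"); // 1/6 (~17%) below quorum: ~17% of row key reads fail, all index reads fail
    }
}

That matches the ratios in your table, and is why the index reads flip from all succeeding to all failing as soon as 2 of any 3 adjacent nodes go down.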

The problem when doing index reads is that the nodes that contain the results can no longer be selected by the partitioner.
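
If you do drop just that query to ONE, note the consistency level is per request, so the rest of your workload can stay at QUORUM. Here is a rough sketch of what the call looks like against the raw Thrift API (the keyspace, column family and column names are made up for illustration; the same idea applies through Hector or whichever client you use, as long as it lets you override the read consistency for that one query):

import java.util.Arrays;
import java.util.List;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.IndexClause;
import org.apache.cassandra.thrift.IndexExpression;
import org.apache.cassandra.thrift.IndexOperator;
import org.apache.cassandra.thrift.KeySlice;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;
import org.apache.cassandra.utils.ByteBufferUtil;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class IndexReadAtOne {
    public static void main(String[] args) throws Exception {
        // Plain Thrift connection to one node (9160 is the default RPC port).
        TTransport transport = new TFramedTransport(new TSocket("127.0.0.1", 9160));
        transport.open();
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        client.set_keyspace("MyKeyspace"); // made-up keyspace name

        // Secondary index lookup: WHERE state = 'UT' (made-up column family / column).
        IndexExpression expr = new IndexExpression(
                ByteBufferUtil.bytes("state"), IndexOperator.EQ, ByteBufferUtil.bytes("UT"));
        IndexClause clause = new IndexClause(
                Arrays.asList(expr), ByteBufferUtil.EMPTY_BYTE_BUFFER, 100); // start key, row limit

        SlicePredicate predicate = new SlicePredicate();
        predicate.setSlice_range(new SliceRange(
                ByteBufferUtil.EMPTY_BYTE_BUFFER, ByteBufferUtil.EMPTY_BYTE_BUFFER, false, 100));

        // Only this request is done at ONE; other operations keep their own level.
        List<KeySlice> rows = client.get_indexed_slices(
                new ColumnParent("Users"), clause, predicate, ConsistencyLevel.ONE);

        System.out.println("matched rows: " + rows.size());
        transport.close();
    }
}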

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 2/06/2012, at 5:15 AM, Jim Ancona wrote:

Hi,

We have an application with two code paths, one of which uses a secondary index query and the other, which doesn't. While testing node down scenarios in our cluster we got a result which surprised (and concerned) me, and I wanted to find out if the behavior we observed is expected.

Background:
  • 6 nodes in the cluster (in order: A, B, C, E, F and G)
  • RF = 3
  • All operations at QUORUM
  • Operation 1: Read by row key followed by write
  • Operation 2: Read by secondary index, followed by write
While running a mixed workload of operations 1 and 2, we got the following results:

Scenario             Result
All nodes up         All operations succeed
One node down        All operations succeed
Nodes A and E down   All operations succeed
Nodes A and B down   Operation 1: ~33% fail; Operation 2: All fail
Nodes A and C down   Operation 1: ~17% fail; Operation 2: All fail

We had expected (perhaps incorrectly) that the secondary index reads would fail in proportion to the portion of the ring that was unable to reach quorum, just as the row key reads did. For both operation types the underlying failure was an UnavailableException.

The same pattern repeated for the other scenarios we tried. The row key operations failed at the expected ratios, given the portion of the ring that was unable to meet quorum because of nodes down, while all the secondary index reads failed as soon as 2 out of any 3 adjacent nodes were down.

Is this expected behavior? Is it documented anywhere? I didn't find it with a quick search.

The operation doing the secondary index query is an important one for our app, and we'd really prefer that it degrade gracefully in the face of cluster failures. My plan at this point is to do that query at ConsistencyLevel.ONE (and accept the increased risk of inconsistency). Will that work?

Thanks in advance,

Jim
