From: aaron morton
To: user@cassandra.apache.org
Subject: Re: Secondary Indexes, Quorum and Cluster Availability
Date: Tue, 5 Jun 2012 06:34:17 +1200

IIRC index slices work a little differently with consistency: they need to have CL level nodes available for all token ranges. If you drop it to CL ONE the read is local only for a particular token range.
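
To put some numbers on that, here is a quick back-of-the-envelope sketch (assuming RF=3 with SimpleStrategy-style placement, i.e. each range is replicated on its owning node and the next two nodes in ring order; adjust if your placement differs). A row key read only needs quorum in its own replica set, so it fails for just the ranges that have lost 2 of their 3 replicas. An index read needs quorum in every replica set at once, so a single range below quorum makes the whole query unavailable:

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class QuorumMath {
    // Ring order from your message: A, B, C, E, F, G.
    static final String[] RING = {"A", "B", "C", "E", "F", "G"};

    // Replicas for the range owned by node i are nodes i, i+1 and i+2
    // (wrapping around the ring), i.e. SimpleStrategy-style placement.
    static int rangesWithoutQuorum(Set<String> down) {
        int lost = 0;
        for (int i = 0; i < RING.length; i++) {
            int live = 0;
            for (int r = 0; r < 3; r++) {
                if (!down.contains(RING[(i + r) % RING.length])) live++;
            }
            if (live < 2) lost++; // QUORUM with RF=3 needs 2 live replicas
        }
        return lost;
    }

    static void report(String n1, String n2) {
        int lost = rangesWithoutQuorum(new HashSet<String>(Arrays.asList(n1, n2)));
        System.out.println(n1 + "," + n2 + " down: " + lost + "/6 ranges below quorum");
    }

    public static void main(String[] args) {
        report("A", "E"); // 0/6 below quorum: everything succeeds
        report("A", "B"); // 2/6 (~33%) below quorum: ~33% of row key reads fail, all index reads fail
        report("A", "C"); // 1/6 (~17%) below quorum: ~17% of row key reads fail, all index reads fail
    }
}

That matches the ratios in your table, and is why the index reads flip from all succeeding to all failing as soon as 2 of any 3 adjacent nodes go down.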

The problem when doing index reads is that the nodes that contain the results can no longer be selected by the partitioner.
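
If you do drop just that query to ONE, note the consistency level is per request, so the rest of your workload can stay at QUORUM. Here is a rough sketch of what the call looks like against the raw Thrift API (the keyspace, column family and column names are made up for illustration; the same idea applies through Hector or whichever client you use, as long as it lets you override the read consistency for that one query):

import java.util.Arrays;
import java.util.List;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.IndexClause;
import org.apache.cassandra.thrift.IndexExpression;
import org.apache.cassandra.thrift.IndexOperator;
import org.apache.cassandra.thrift.KeySlice;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;
import org.apache.cassandra.utils.ByteBufferUtil;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class IndexReadAtOne {
    public static void main(String[] args) throws Exception {
        // Plain Thrift connection to one node (9160 is the default RPC port).
        TTransport transport = new TFramedTransport(new TSocket("127.0.0.1", 9160));
        transport.open();
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        client.set_keyspace("MyKeyspace"); // made-up keyspace name

        // Secondary index lookup: WHERE state = 'UT' (made-up column family / column).
        IndexExpression expr = new IndexExpression(
                ByteBufferUtil.bytes("state"), IndexOperator.EQ, ByteBufferUtil.bytes("UT"));
        IndexClause clause = new IndexClause(
                Arrays.asList(expr), ByteBufferUtil.EMPTY_BYTE_BUFFER, 100); // start key, row limit

        SlicePredicate predicate = new SlicePredicate();
        predicate.setSlice_range(new SliceRange(
                ByteBufferUtil.EMPTY_BYTE_BUFFER, ByteBufferUtil.EMPTY_BYTE_BUFFER, false, 100));

        // Only this request is done at ONE; other operations keep their own level.
        List<KeySlice> rows = client.get_indexed_slices(
                new ColumnParent("Users"), clause, predicate, ConsistencyLevel.ONE);

        System.out.println("matched rows: " + rows.size());
        transport.close();
    }
}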

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 2/06/2012, at 5:15 AM, Jim Ancona wrote:

Hi,

We have an application with two code paths, one of which uses a secondary index query and the other, which doesn't. While testing node down scenarios in our cluster we got a result which surprised (and concerned) me, and I wanted to find out if the behavior we observed is expected.

Background:
  • 6 nodes in the cluster (in order: A, B, C, E, F and G)
  • RF = 3
  • All operations at QUORUM
  • Operation 1: Read by row key followed by write
  • Operation 2: Read by secondary index, followed by write
While running a mixed workload of operations 1 and 2, we got the following results:

Scenario             Result
All nodes up         All operations succeed
One node down        All operations succeed
Nodes A and E down   All operations succeed
Nodes A and B down   Operation 1: ~33% fail; Operation 2: All fail
Nodes A and C down   Operation 1: ~17% fail; Operation 2: All fail

We had expected (perhaps incorrectly) that the secondary index reads would fail in proportion to the portion of the ring that was unable to reach quorum, just as the row key reads did. For both operation types the underlying failure was an UnavailableException.

The same pattern repeated for the other scenarios we tried. The row key operations failed at the expected ratios, given the portion of the ring that was unable to meet quorum because of nodes down, while all the secondary index reads failed as soon as 2 out of any 3 adjacent nodes were down.

Is this expected behavior? Is it documented anywhere? I didn't find it with a quick search.

The operation doing the secondary index query is an important one for our app, and we'd really prefer that it degrade gracefully in the face of cluster failures. My plan at this point is to do that query at ConsistencyLevel.ONE (and accept the increased risk of inconsistency). Will that work?

Thanks in advance,

Jim
