From: Dan Foody <dan.foody@gmail.com>
To: user@cassandra.apache.org
Subject: Re: Expanding Cassandra on EC2 with consistency
Date: Tue, 3 Jul 2012 21:56:32 -0400

Hi Alex,

Can you share what replication factor you're running?
And, are you using ephemeral disks or EBS volumes?

Thanks!

- Dan



On Jul 3, 2012, at 5:52 PM, Alex Major wrote:

Hi Mike,

We've run a small (4 node) cluster in the EU region since September last year. We run across all 3 availability zones in the EU region, with 2 nodes in one AZ and a further node in each of the other two AZs. The latency difference between running within and between AZs has been minimal in our experience.

It's only when we've gone cross-region that there have been latency problems. We temporarily ran a 9 node cluster across 3 regions, but even then, using LOCAL_QUORUM, the latency was better than the standard datacenter-to-datacenter latency we're used to.
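(For illustration only -- a rough pycassa sketch of requesting LOCAL_QUORUM per operation; the keyspace and column family names are placeholders, and the client library is just an assumption on my part, so adapt to whatever client you actually use:)

    # Rough sketch with pycassa -- keyspace/CF names are placeholders.
    from pycassa.pool import ConnectionPool
    from pycassa.columnfamily import ColumnFamily
    from pycassa.cassandra.ttypes import ConsistencyLevel

    pool = ConnectionPool('MyKeyspace', ['10.0.0.1:9160'])
    cf = ColumnFamily(pool, 'MyColumnFamily')

    # LOCAL_QUORUM only waits for a quorum of replicas in the local
    # datacenter, so cross-region latency stays off the request path.
    cf.insert('some_key', {'col': 'value'},
              write_consistency_level=ConsistencyLevel.LOCAL_QUORUM)
    row = cf.get('some_key',
                 read_consistency_level=ConsistencyLevel.LOCAL_QUORUM)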

EC2Snitch is definitely the way to go in favour of NTS in my opinion. NTS was a pain to get set up with the internal (private) IP addresses, so much so that we never got it safely replicating the data as we wanted.
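(To make the setup concrete: the snitch is chosen in cassandra.yaml via endpoint_snitch: Ec2Snitch, which names the datacenter after the region (e.g. "eu-west" for eu-west-1) and the rack after the AZ, and the keyspace then carries a per-datacenter replica count. Below is a rough sketch using pycassa's SystemManager -- the keyspace name and replica count are placeholders, and treat the exact call as illustrative rather than definitive:)

    # Sketch only: a keyspace that places 3 replicas in the "eu-west"
    # datacenter (the name Ec2Snitch derives from the eu-west-1 region).
    from pycassa.system_manager import SystemManager, NETWORK_TOPOLOGY_STRATEGY

    sysm = SystemManager('10.0.0.1:9160')
    sysm.create_keyspace('MyKeyspace',
                         replication_strategy=NETWORK_TOPOLOGY_STRATEGY,
                         strategy_options={'eu-west': '3'})
    sysm.close()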

Alex.

On Tue, Jul 3, 2012 at 2:16 PM, Michael Theroux <mtheroux2@yahoo.com> wrote:
Hello,

We are currently running a web application utilizing Cassandra on EC2. Given the recent outages experienced with Amazon, we want to consider expanding Cassandra across availability zones sooner rather than later.

We are trying to determine the optimal way to deploy Cassandra in this environment. We are researching the NetworkTopologyStrategy and the EC2Snitch. We are also interested in providing a high level of read or write consistency.

My understanding is that the EC2Snitch recognizes availability zones as racks, and regions as data centers. This seems to be a common configuration. However, if we were to utilize queries with a read or write consistency of QUORUM, would there be a high possibility that the communication necessary to establish a quorum would have to cross availability zones?

My understanding is that the NetworkTopologyStrategy prefers to place replicas on different racks within the datacenter, which would equate to other availability zones in EC2. This implies to me that, in order to have the quorum of nodes necessary to achieve consistency, Cassandra will communicate with nodes across availability zones.
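(To put numbers on that: Cassandra's quorum is floor(RF/2) + 1, so with, say, RF=3 and one replica per AZ, a QUORUM operation needs responses from at least two AZs. The RF=3 layout is just an example, not our actual configuration. A trivial sketch:)

    # Quorum size in Cassandra: floor(RF / 2) + 1 replicas must respond.
    # With RF=3 and NetworkTopologyStrategy/Ec2Snitch placing one replica
    # per AZ (rack), a QUORUM read or write must reach at least two AZs.
    def quorum(replication_factor):
        return replication_factor // 2 + 1

    for rf in (2, 3, 5):
        print("RF=%d -> quorum of %d replicas" % (rf, quorum(rf)))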

First, is my understanding correct? Second, given the high latency that can sometimes exist between availability zones, is this a problem, and should we instead treat availability zones as data centers?

Ideally, we would be able to set up a configuration where we could store replicas across availability zones in case of failure, but establish a high level of read or write consistency within a single availability zone.

I appreciate your responses,
Thanks,
-Mike




