Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (nike.apache.org: domain of rkalla@gmail.com designates
 209.85.161.172 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CAO5xsd2Kx0_dUUUQTsf2Tmnr0qw-y=QExwj+K3q-Uk8o1eBUXQ@mail.gmail.com>
References: <60E014B3-4010-4987-BD42-CA4242675BB7@gmail.com>
 <1320646450.33642.YahooMailNeo@web95202.mail.in2.yahoo.com>
 <CABn9xAGRTdOTTVNM-RD+Zqdj7BSbDXg3zjxmma2EmAp5APAu0g@mail.gmail.com>
 <477F356C-DD18-46DC-8463-EF278DE9C97B@gmail.com>
 <CABn9xAF8cfp+gxOTegX68QkRdH1FFnEDtnWkuTDV2+MXJYc9Cg@mail.gmail.com>
 <CA+nPnMxoJcrWNTF1V7qOcrYD1p=d=_f0BYHo55T=50U3xcsiXw@mail.gmail.com>
 <CABn9xAEy24jUkbpn-pOMv4jt5FDGYoPXhW0NQ7i9XTmKPT8+ag@mail.gmail.com>
 <CAO5xsd2Kx0_dUUUQTsf2Tmnr0qw-y=QExwj+K3q-Uk8o1eBUXQ@mail.gmail.com>
From: Riyad Kalla <rkalla@gmail.com>
Date: Mon, 7 Nov 2011 21:07:05 -0700
Message-ID: 
 <CABn9xAHuVMdppssx+kgKzxHqW5=oKuUBeNpi4i1msTLUQzJ0gA@mail.gmail.com>
Subject: Re: Will writes with < ALL consistency eventually propagate?
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=000e0cd570febc3e8a04b1314ff4

--000e0cd570febc3e8a04b1314ff4
Content-Type: text/plain; charset=ISO-8859-1

Peter,

Thanks for the additional insight on this. Think of a CDN, distributed
around the globe, that needs to respond to requests. Ideally each edge
location could answer as quickly as possible from its own copy of the data
(RF=N). But suppose the ring members keep open/active connections to each
other and a request comes in to an edge location that does not hold a copy
of the data. Does that node request the data from a node that does and then
cache it, in case more requests for the same data arrive at that edge
location? Or does it reply once and forget it, requiring *each* subsequent
request to that node to phone back home to the node that actually holds the
data?

The CDN/edge-server scenario works particularly well to illustrate my
goals, if visualizing that helps.

Look forward to your thoughts.

-R

On Mon, Nov 7, 2011 at 8:05 PM, Peter Schuller
<peter.schuller@infidyne.com> wrote:

> > Given that, the way I've understood this discussion so far is I would have a
> > RF of N (my total node count) but my Consistency Level with all my writes
> > will *likely* be QUORUM -- I think that is a good/safe default for me to use
> > as writes aren't the scenario I need to optimize for latency; that being
> > said, I also don't want to wait for a ConsistencyLevel of ALL to complete
> > before my code continues though.
> > Would you agree with this assessment or am I missing the boat on something?
>
> Are you *sure* you care about latency to the degree that data being
> non-local actually matters to your application?
>
> Normally you don't set RF=N unless you have particularly special
> requirements. The extra latency implied by another network round-trip
> is certainly greater than zero, but in many practical situations
> outliers and the behavior in case of e.g. node problems is much more
> important than an extra millisecond or two on the average request.
> Setting RF=N causes a larger data set on each node, in addition to
> causing more nodes to be involved in every request. Consider whether
> it's a better use of resources to set RF to e.g. 3 instead, and let
> the ring grow independently. That is what one normally does.
>
> --
> / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
>
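
To make the RF trade-off in the quoted reply concrete, here is a minimal
Python sketch of the arithmetic. It is not taken from Cassandra's code. It
assumes the usual QUORUM definition of floor(RF / 2) + 1 acknowledgements,
and that data is spread evenly around the ring so each node holds roughly
RF / N of the data set. The function names and the 12-node ring are
hypothetical, chosen only for illustration.

```python
def quorum(replication_factor: int) -> int:
    """Acknowledgements a QUORUM operation waits for: floor(RF / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def data_fraction_per_node(replication_factor: int, node_count: int) -> float:
    """Rough share of the data set on each node, assuming an even ring."""
    if not 1 <= replication_factor <= node_count:
        raise ValueError("need 1 <= RF <= node count")
    return replication_factor / node_count


if __name__ == "__main__":
    nodes = 12  # hypothetical ring size
    for rf in (nodes, 3):  # RF=N versus a small fixed RF
        print(
            f"RF={rf}: QUORUM waits for {quorum(rf)} acks, "
            f"each node holds ~{data_fraction_per_node(rf, nodes):.0%} of the data"
        )
```

With these assumptions, RF=12 on 12 nodes means a QUORUM write waits for 7
acknowledgements and every node stores the full data set, while RF=3 waits
for 2 and each node stores about a quarter. In both cases the write is still
sent to every replica; the consistency level only sets how many
acknowledgements the coordinator waits for before returning.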

--000e0cd570febc3e8a04b1314ff4--