From: Riyad Kalla
Date: Mon, 7 Nov 2011 21:07:05 -0700
Subject: Re: Will writes with < ALL consistency eventually propagate?
To: user@cassandra.apache.org

Peter,

Thanks for the additional insight on this -- think of a CDN that needs to respond to requests, distributed around the globe. Ultimately you would hope that each edge location could respond as quickly as possible (RF=N) but if each of the ring members keep open/active connections to each other, and a request comes in to an edge location that does not contain a copy of the data, does it request the data from the node that does, then cache it (in the case of more requests coming into that edge location with the same request) or does it reply once and forget it, requiring *each* subsequent request to that node to always phone back home to the node that actually contains it?

The CDN/edge-server scenario works particularly well to illustrate my goals, if visualizing that helps.

Look forward to your thoughts.

-R

On Mon, Nov 7, 2011 at 8:05 PM, Peter Schuller wrote:

> > Given that, the way I've understood this discussion so far is I would have a
> > RF of N (my total node count) but my Consistency Level with all my writes
> > will *likely* be QUORUM -- I think that is a good/safe default for me to use
> > as writes aren't the scenario I need to optimize for latency; that being
> > said, I also don't want to wait for a ConsistencyLevel of ALL to complete
> > before my code continues though.
> > Would you agree with this assessment or am I missing the boat on something?
>
> Are you *sure* you care about latency to the degree that data being
> non-local actually matters to your application?
>
> Normally you don't set RF=N unless you have particularly special
> requirements. The extra latency implied by another network round-trip
> is certainly greater than zero, but in many practical situations
> outliers and the behavior in case of e.g. node problems is much more
> important than an extra millisecond or two on the average request.
> Setting RF=N causes a larger data set on each node, in addition to
> causing more nodes to be involved in every request. Consider whether
> it's a better use of resources to set RF to e.g. 3 instead, and let
> the ring grow independently. That is what one normally does.
>
> --
> / Peter Schuller (@scode, http://worldmodscode.wordpress.com)

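To put rough numbers on the RF=N versus RF=3 trade-off discussed above, here is a minimal sketch. It assumes QUORUM means floor(RF / 2) + 1 replicas and that data is spread evenly around the ring, so each node holds about RF / N of the full data set. The 12-node cluster size and the function names are made up for illustration; nothing here comes from the Cassandra code base.

```python
def quorum(rf: int) -> int:
    """Replicas that must acknowledge a QUORUM request: floor(RF / 2) + 1."""
    if rf < 1:
        raise ValueError("replication factor must be at least 1")
    return rf // 2 + 1


def per_node_share(rf: int, nodes: int) -> float:
    """Approximate fraction of the full data set held by each node,
    assuming data is spread evenly around the ring."""
    if not 1 <= rf <= nodes:
        raise ValueError("need 1 <= rf <= nodes")
    return rf / nodes


def compare(nodes: int, rfs):
    """Return one summary row per replication factor for a ring of `nodes` nodes."""
    return [
        {"rf": rf, "quorum": quorum(rf), "share": per_node_share(rf, nodes)}
        for rf in rfs
    ]


if __name__ == "__main__":
    nodes = 12  # hypothetical cluster size, chosen only for illustration
    for row in compare(nodes, [3, nodes]):
        print(f"RF={row['rf']:>2}  QUORUM waits for {row['quorum']} replicas  "
              f"each node stores ~{row['share']:.0%} of the data")
```

In this hypothetical 12-node ring, RF=3 has QUORUM wait on 2 replicas with each node holding about a quarter of the data, while RF=12 has QUORUM wait on 7 replicas with every node holding all of it. That is the "larger data set on each node" and "more nodes involved in every request" cost in concrete terms.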