Subject: Re: Is it possible to achieve "sticky" request routing?
From: Steve Robenalt <srobenalt@highwire.org>
To: user@cassandra.apache.org
Date: Tue, 5 Apr 2016 15:53:42 -0700

Hi Micky,

The only issues I've personally seen with dropped updates due to even small
amounts of time skew were ones I was able to fix with judicious use of
BatchStatements. This was particularly true of counter fields in the 2.0
release (i.e. before counters were fixed), but it would also happen when
different columns were updated in separate statements. I'm not sure what the
circumstances of your lost updates are, so I can't say whether this will
help; I'm only pointing it out because it was effective in my cases.
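As a rough sketch of the idea (assuming the DataStax Java driver and an
illustrative users table, not the exact statements from that case), the point
is to group the related updates into one logged batch so they are applied
together with a single write timestamp:

    import com.datastax.driver.core.BatchStatement;
    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.Session;

    public class BatchUpdateExample {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("my_keyspace"); // hypothetical keyspace

            PreparedStatement updateName =
                    session.prepare("UPDATE users SET name = ? WHERE id = ?");
            PreparedStatement updateEmail =
                    session.prepare("UPDATE users SET email = ? WHERE id = ?");

            // Both updates travel in one logged batch and share a write
            // timestamp, instead of racing each other as two separate
            // statements coordinated by nodes whose clocks may disagree.
            BatchStatement batch = new BatchStatement(BatchStatement.Type.LOGGED);
            batch.add(updateName.bind("Micky", 42));
            batch.add(updateEmail.bind("micky@example.com", 42));
            session.execute(batch);

            cluster.close();
        }
    }

Counter updates can't go in a logged batch; those would use
BatchStatement.Type.COUNTER instead.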
Steve

On Tue, Apr 5, 2016 at 3:21 PM, Mukil Kesavan <weirdbluelights@gmail.com> wrote:

> John and Steve,
>
> We use QUORUM consistency for both READS and WRITES, so we won't have the
> problem of having to pick the right server. The real reason we have this
> requirement is that we run in a sometimes overloaded virtualized
> environment that causes the servers to have non-trivial time drift (tens
> of milliseconds or a few seconds, even with NTP, which catches up slowly).
> This particular client was seeing some updates silently dropped by
> Cassandra when they hit a server with an older local timestamp. They were
> relying on server-side timestamp generation.
>
> So they were exploring "sticky" routing as an option, since the likelihood
> of monotonically increasing timestamps is higher if the client's requests
> always go to a single server. They are aware of the problem of disconnecting
> and reconnecting to a new server and have an application-level solution for
> it. They are also testing client-side timestamps. We're just trying to
> explore all our options and their pros and cons.
>
> Thanks!
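On the client-side timestamp option: with the DataStax Java driver (2.1.2 or
later, against Cassandra 2.1+ / native protocol v3; this is a sketch, not
something verified against this particular setup), the driver can stamp
writes with a monotonically increasing client-side timestamp instead of
relying on the coordinator's clock:

    import com.datastax.driver.core.AtomicMonotonicTimestampGenerator;
    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.SimpleStatement;

    public class ClientTimestampExample {
        public static void main(String[] args) {
            // Every statement gets a client-generated, monotonically increasing
            // timestamp, so a coordinator with a slightly older clock can no
            // longer silently shadow a newer write.
            Cluster cluster = Cluster.builder()
                    .addContactPoint("127.0.0.1")
                    .withTimestampGenerator(new AtomicMonotonicTimestampGenerator())
                    .build();
            Session session = cluster.connect("my_keyspace"); // hypothetical keyspace

            session.execute(new SimpleStatement(
                    "UPDATE users SET name = 'Micky' WHERE id = 42"));

            // A timestamp can also be forced per statement (microseconds):
            SimpleStatement stmt = new SimpleStatement(
                    "UPDATE users SET email = 'micky@example.com' WHERE id = 42");
            stmt.setDefaultTimestamp(System.currentTimeMillis() * 1000L);
            session.execute(stmt);

            cluster.close();
        }
    }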
>
> On Tue, Apr 5, 2016 at 1:31 PM, Jonathan Haddad <jon@jonhaddad.com> wrote:
>
>> Yep - Steve hit the nail on the head. The odds of hitting the right
>> server with "sticky routing" go down as your cluster size increases. You
>> end up adding extra network hops instead of using token-aware routing.
>>
>> Unless you're trying to build a coordinator tier (and you're not,
>> according to your original post), this is a pretty bad idea and I'd
>> advise you to push back on that requirement.
>>
>> On Tue, Apr 5, 2016 at 12:47 PM Steve Robenalt <srobenalt@highwire.org> wrote:
>>
>>> Aside from Jon's "why" question, I would point out that this only
>>> really works because you are running a 3-node cluster with RF=3. If
>>> your cluster grows, you can't guarantee that any one server will have
>>> all records. I'd be pretty hesitant to put an invisible constraint like
>>> that on a cluster unless you're pretty sure it'll only ever be 3 nodes.
>>>
>>> On Tue, Apr 5, 2016 at 9:34 AM, Jonathan Haddad <jon@jonhaddad.com> wrote:
>>>
>>>> Why is this a requirement? Honestly, I don't know why you would do this.
>>>>
>>>> On Sat, Apr 2, 2016 at 8:06 PM Mukil Kesavan <weirdbluelights@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> We currently have 3 Cassandra servers running in a single datacenter
>>>>> with a replication factor of 3 for our keyspace. We also use the
>>>>> SimpleSnitch with DynamicSnitching enabled by default. Our load
>>>>> balancing policy is TokenAwareLoadBalancingPolicy with RoundRobinPolicy
>>>>> as the child. This overall configuration results in our client requests
>>>>> spreading equally across our 3 servers.
>>>>>
>>>>> However, we have a new requirement where we need to restrict a
>>>>> client's requests to a single server and only go to the other servers
>>>>> on failure of that server. This particular use case does not have high
>>>>> request traffic.
>>>>>
>>>>> Looking at the documentation, the options we have seem to be:
>>>>>
>>>>> 1. Play with the snitching (e.g. place each server into its own DC or
>>>>> rack) to ensure that requests always go to one server and fail over to
>>>>> the others if required. I understand that this may also affect replica
>>>>> placement and we may need to run nodetool repair, so this is not our
>>>>> preferred option.
>>>>>
>>>>> 2. Write a new load balancing policy that also uses the
>>>>> HostStateListener for tracking host up and down messages, and that
>>>>> essentially accomplishes "sticky" request routing with failover to
>>>>> other nodes (a rough sketch of such a policy appears below).
>>>>>
>>>>> Is option 2 the only clean way of accomplishing our requirement?
>>>>>
>>>>> Thanks,
>>>>> Micky

--
Steve Robenalt
Software Architect
srobenalt@highwire.org
(office/cell): 916-505-1785

HighWire Press, Inc.
425 Broadway St, Redwood City, CA 94063
www.highwire.org

Technology for Scholarly Communication
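Regarding option 2 in the quoted post, a minimal sketch of a "sticky" load
balancing policy for the DataStax Java driver (assuming driver 3.x; the
policy interface already receives host up/down callbacks, and everything
below is illustrative rather than a tested implementation):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Host;
    import com.datastax.driver.core.HostDistance;
    import com.datastax.driver.core.Statement;
    import com.datastax.driver.core.policies.LoadBalancingPolicy;

    import java.util.Collection;
    import java.util.Iterator;
    import java.util.List;
    import java.util.concurrent.CopyOnWriteArrayList;

    // Routes every request to the first live host and falls back to the
    // remaining hosts, in order, if that host is marked down.
    public class StickyPolicy implements LoadBalancingPolicy {

        private final List<Host> liveHosts = new CopyOnWriteArrayList<Host>();

        @Override
        public void init(Cluster cluster, Collection<Host> hosts) {
            liveHosts.addAll(hosts);
        }

        @Override
        public HostDistance distance(Host host) {
            return HostDistance.LOCAL; // single DC, as in the original setup
        }

        @Override
        public Iterator<Host> newQueryPlan(String loggedKeyspace, Statement statement) {
            // The first live host is the sticky target; the rest are
            // failover candidates tried in order.
            return liveHosts.iterator();
        }

        @Override
        public void onUp(Host host) {
            if (!liveHosts.contains(host)) {
                liveHosts.add(host); // recovered hosts go to the back of the line
            }
        }

        @Override
        public void onDown(Host host) {
            liveHosts.remove(host); // the next query plan starts with the next host
        }

        @Override
        public void onAdd(Host host) {
            onUp(host);
        }

        @Override
        public void onRemove(Host host) {
            onDown(host);
        }

        @Override
        public void close() {
            // nothing to clean up in this sketch
        }
    }

It would be registered with Cluster.builder().withLoadBalancingPolicy(new
StickyPolicy()). As Jon and Steve note above, this gives up token-aware
routing and adds coordinator hops as the cluster grows, so it is a trade-off
rather than a general recommendation.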