Date: Thu, 9 Aug 2012 11:58:40 +1000
From: Ben Kaehne
To: user@cassandra.apache.org
Cc: Franc Carter, David Nelson
Subject: Syncing nodes + Cassandra Data Availability

Good morning,

Our application runs on a 3-node Cassandra cluster with a replication factor (RF) of 3.

We use quorum operations against this cluster in the hope of guaranteeing consistency.

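A minimal sketch of the kind of quorum read/write we mean (the DataStax Python driver, addresses, keyspace, and table here are purely illustrative):

from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

# Connect to the 3-node cluster (contact points are placeholders).
cluster = Cluster(['10.0.0.1', '10.0.0.2', '10.0.0.3'])
session = cluster.connect('my_keyspace')  # hypothetical keyspace

# QUORUM write: acknowledged once 2 of the 3 replicas have it.
write = SimpleStatement(
    "INSERT INTO kv (key, value) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.QUORUM)
session.execute(write, ('new-key', 'some-value'))

# QUORUM read: needs responses from 2 of the 3 replicas.
read = SimpleStatement(
    "SELECT value FROM kv WHERE key = %s",
    consistency_level=ConsistencyLevel.QUORUM)
row = session.execute(read, ('new-key',)).one()
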
One scenario in which an issue can occur here is:
Out of our 3 nodes, only 2 are up.
We perform a write to, say, a new key.
The down node is started again; at the same time, a different node is brought offline.
At this point, the data we have written is on one of the online nodes but not the other, meaning quorum reads will fail.

Surely other people have encountered this issue before.

We disabled hinted handoff originally so as not to have to worry about servers' disks filling up with accumulated hints. Perhaps hinted handoff would somewhat aid this situation, although from what I read it does not completely remedy it.
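
(Concretely, we turned it off in cassandra.yaml. The hint window setting below is the usual way to bound hint buildup instead of disabling entirely; option names are from the 1.x-era config, and defaults vary by version:)

hinted_handoff_enabled: false
max_hint_window_in_ms: 10800000   # stop collecting hints for a dead node after ~3 hours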

If you have encountered this, how are you dealing with it?
From what I understand, read repair (which we have set to 1.0) is only performed when a read succeeds, which will not happen here.
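
(The 1.0 above is the table's read_repair_chance; for a hypothetical table it is set like this in CQL:)

ALTER TABLE my_keyspace.kv WITH read_repair_chance = 1.0;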

nodetool repair seems rather slow, is manual, and does not suit our situation, where data has to be available on demand.
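
(For the record, what we have been running is the standard anti-entropy repair; the host and keyspace names here are placeholders, and -pr limits each run to that node's primary ranges:)

nodetool -h 10.0.0.1 repair my_keyspace
nodetool -h 10.0.0.1 repair -pr my_keyspace   # primary ranges only; run on each node in turn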

Regards,

--
-Ben