Date: Fri, 3 Dec 2010 22:19:52 -0500
Subject: Confused about consistency
From: Dan Hendry
To: user@cassandra.apache.org

I am seeing fairly strange behavior in my Cassandra cluster.

Setup:
 - 3 nodes (let's call them nodes 1, 2, and 3)
 - RF=2
 - A set of servers (producers) which write data to the cluster at consistency level ONE
 - A set of servers (consumers/processors) which read data from the cluster at consistency level ALL
 - Cassandra 0.7 (a recent build from the svn branch, post beta 3)
 - Clients use the Pelops library

Situation:
 - Everything is humming along nicely.
 - A Cassandra node (say 3) goes down (even with 24 GB of RAM, OOM errors are the bane of my existence).
 - Producers continue to happily write to the cluster, but consumers start complaining by throwing TimedOutExceptions and UnavailableExceptions.
 - I stagger out of bed in the middle of the night and restart Cassandra on node 3.
 - The consumers stop complaining and get back to business, but generate garbage data for the period node 3 was down. It's almost as if half the data is missing half the time. (Again, I am reading at consistency level ALL.)
 - I force the consumers to reprocess data for the period node 3 was down. They generate accurate output which is different from the first time round.

To be explicit, what seems to be happening is that the first read at consistency level ALL gives "A,C,E" (for example) and the second read at consistency level ALL gives "A,B,C,D,E".

Is this a Cassandra bug? Is my knowledge of consistency levels flawed? My understanding is that you could achieve strongly consistent behavior by writing at ONE and reading at ALL.

After this experience, my theory (uneducated, untested, and under-researched) is that "strong consistency" applies only to column values, not to the set of columns (or super-columns in this case) which make up a row. Any thoughts?
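
For reference, the expectation that writing at ONE and reading at ALL should be strongly consistent follows from the usual replica-overlap rule: a read is guaranteed to touch at least one replica that acknowledged the write when (replicas read) + (replicas written) > RF. The sketch below only models that arithmetic for the RF=2 setup described in this message. It is not Cassandra code, and the function names (`replicas_required`, `reads_overlap_writes`, `can_serve`) are hypothetical helpers invented for illustration.

```python
# Hypothetical helper names; this models the replica-overlap rule only,
# not Cassandra's actual read or write path.


def replicas_required(level: str, rf: int) -> int:
    """Number of replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1
    if level == "ALL":
        return rf
    raise ValueError(f"unsupported consistency level: {level}")


def reads_overlap_writes(read_level: str, write_level: str, rf: int) -> bool:
    """True when any read replica set must intersect any write replica set."""
    needed = replicas_required(read_level, rf) + replicas_required(write_level, rf)
    return needed > rf


def can_serve(level: str, rf: int, live_replicas: int) -> bool:
    """True when enough replicas are alive to satisfy the consistency level."""
    return live_replicas >= replicas_required(level, rf)


if __name__ == "__main__":
    rf = 2  # replication factor from the setup in this message
    print(reads_overlap_writes("ALL", "ONE", rf))  # write ONE, read ALL
    print(reads_overlap_writes("ONE", "ONE", rf))  # write ONE, read ONE
    print(can_serve("ONE", rf, live_replicas=1))   # writes with one replica down
    print(can_serve("ALL", rf, live_replicas=1))   # reads with one replica down
```

Under these assumptions, write ONE plus read ALL satisfies the rule at RF=2, which is why the differing results between the two reads look surprising. The last two calls mirror the outage: with one replica of a row down, a write at ONE can still be satisfied while a read at ALL cannot, which is consistent with the consumers' exceptions while node 3 was offline.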