Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
MIME-Version: 1.0
In-Reply-To: <AANLkTimVb6nuMvP+i2G5eX2ecxR1vFi2n9nxaM31kT0F@mail.gmail.com>
References: <AANLkTimVb6nuMvP+i2G5eX2ecxR1vFi2n9nxaM31kT0F@mail.gmail.com>
Date: Fri, 3 Dec 2010 23:03:23 -0600
Message-ID: <AANLkTim++KpkNx+9yRJYr=K0W8s0Ry+uo-HsuKMTaHJf@mail.gmail.com>
Subject: Re: Confused about consistency
From: Jonathan Ellis <jbellis@gmail.com>
To: user <user@cassandra.apache.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

I think you are running into
https://issues.apache.org/jira/browse/CASSANDRA-1316, where, when an
inconsistency was discovered on a QUORUM/ALL read, the repair was always
performed at QUORUM instead of the original CL.  Thus, reading at ALL you
would see the correct answer on the 2nd read but you weren't
guaranteed to see it on the first.
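
To make the expected behavior concrete, here is a toy Python sketch. It is
not Cassandra's actual read path: merge_rows, overlaps, and the sample data
are hypothetical names made up for illustration. It assumes a read at ALL
resolves the replies from every replica (union of columns, newest timestamp
wins), and it checks the usual overlap rule R + W > RF.

```python
# Toy model only: hypothetical names, not Cassandra's actual read path.
from typing import Dict, List, Tuple

# One replica's view of a row: column name -> (value, write timestamp).
Row = Dict[str, Tuple[str, int]]


def merge_rows(replies: List[Row]) -> Row:
    """Resolve replica replies: union of columns, newest timestamp wins."""
    merged: Row = {}
    for reply in replies:
        for name, (value, ts) in reply.items():
            if name not in merged or ts > merged[name][1]:
                merged[name] = (value, ts)
    return merged


def overlaps(write_replicas: int, read_replicas: int, rf: int) -> bool:
    """True when every read set must intersect every write set."""
    return write_replicas + read_replicas > rf


# RF=2: one replica was down and missed the writes of columns B and D.
healthy: Row = {c: ("v", i) for i, c in enumerate("ABCDE")}
stale: Row = {c: healthy[c] for c in "ACE"}

print(sorted(stale))                         # ['A', 'C', 'E']
print(sorted(merge_rows([stale, healthy])))  # ['A', 'B', 'C', 'D', 'E']
print(overlaps(1, 2, 2))                     # True: write ONE, read ALL, RF=2
```

In this model a read that resolves across both replicas returns A,B,C,D,E
the first time, which is what reading at ALL is expected to give; seeing
only A,C,E corresponds to answering from the stale replica's data alone.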

This was fixed in 0.6.4 but apparently I botched the merge to the 0.7
branch.  I corrected that just now, so when you update, you should be
good to go.

On Fri, Dec 3, 2010 at 9:19 PM, Dan Hendry <dan.hendry.junk@gmail.com> wrote:
> I am seeing fairly strange behavior in my Cassandra cluster.
> Setup:
>  - 3 nodes (let's call them nodes 1, 2 and 3)
>  - RF=2
>  - A set of servers (producers) which write data to the cluster at
> consistency level ONE
>  - A set of servers (consumers/processors) which read data from the cluster
> at consistency level ALL
>  - Cassandra 0.7 (recent out of the svn branch, post beta 3)
>  - Clients use the pelops library
> Situation:
>  - Everything is humming along nicely
>  - A Cassandra node (say 3) goes down (even with 24 GB of RAM, OOM errors
> are the bane of my existence)
>  - Producers continue to happily write to the cluster but consumers start
> complaining by throwing TimeOutExceptions and UnavailableExceptions.
>  - I stagger out of bed in the middle of the night and restart Cassandra on
> node 3.
>  - The consumers stop complaining and get back to business but generate
> garbage data for the period node 3 was down. It's almost like half the data
> is missing half the time. (Again, I am reading at consistency level ALL).
>  - I force the consumers to reprocess data for the period node 3 was down.
> They generate accurate output which is different from the first time round.
> To be explicit, what seems to be happening is that the first read at
> consistency level ALL gives "A,C,E" (for example) and the second read at
> consistency level ALL gives "A,B,C,D,E". Is this a Cassandra bug? Is my
> knowledge of consistency levels flawed? My understanding is that you could
> achieve strongly consistent behavior by writing at ONE and reading at ALL.
> After this experience, my theory (uneducated, untested, and
> under-researched) is that "strong consistency" applies only to column
> values, not the set of columns (or super-columns in this case) which make up
> a row. Any thoughts?


-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com