From: Kais Ahmed
To: user@cassandra.apache.org
Date: Fri, 24 May 2013 18:58:59 +0200
Subject: Re: Cassandra read repair
References: <833ED583-13CF-45AE-8545-CCF48AD2CB68@thelastpickle.com>
Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm

Hi Aaron, and thanks,

> If you are reading and writing at CL QUORUM and getting inconsistent
> results that sounds like a bug. If you are mixing the CL levels such that
> R + W <= N then it's expected behaviour.

I think it's a bug. It concerns only some keys (~200 out of 120,000 keys) in one column family.

As I understand it, if R + W <= N, an inconsistent result is expected at a given moment, but read repair will correct the inconsistent data (depending on the read_repair_chance value) and the next query will return consistent data, right?

Here is an example of the results I get for one key (keyspace RF 3), from 3 different replicas:

[default@prod] ASSUME contacts_timeordered KEYS AS ascii;

[default@prod] get contacts_timeordered['1425185-IGNORED'];
=> (column=1a927740-97ec-11e2-ab16-a1afd66a735e, value=363838353936, timestamp=1364505108842098)
=> (column=1a93d5b0-97ec-11e2-888c-2bf068e0f754, value=31373930303330, timestamp=1364505108851088)
=> (column=b5c559c0-9d0f-11e2-8682-f7ecd4112689, value=32343130303930, timestamp=1365070157421869)
=> (column=7ba22b90-b48b-11e2-a2c2-914573921d9f, value=32353031353039, timestamp=1367652194221857)
=> (column=63ef5d80-b7e8-11e2-abf8-593c289227cd, value=32383435323830, timestamp=1368021951146575)
=> (column=d6383fc0-b810-11e2-a880-bd2ecacbaee3, value=31363334363737, timestamp=1368039322753824)
=> (column=f47d8e60-bd3f-11e2-88f4-533a93fe9432, value=32373938313038, timestamp=1368609315699785)
=> (column=c5bfe060-bf8e-11e2-ab1f-07be407aff58, value=32333634353034, timestamp=1368863069848610)
=> (column=f07ae4b0-c42f-11e2-8064-9794e872eb2b, value=363838353936, timestamp=1369372095163129)
Returned 9 results.
Elapsed time: 10 msec(s).

[default@prod] get contacts_timeordered['1425185-IGNORED'];
=> (column=b5c559c0-9d0f-11e2-8682-f7ecd4112689, value=32343130303930, timestamp=1365070157421869)
=> (column=7ba22b90-b48b-11e2-a2c2-914573921d9f, value=32353031353039, timestamp=1367652194221857)
=> (column=63ef5d80-b7e8-11e2-abf8-593c289227cd, value=32383435323830, timestamp=1368021951146575)
=> (column=d6383fc0-b810-11e2-a880-bd2ecacbaee3, value=31363334363737, timestamp=1368039322753824)
=> (column=f47d8e60-bd3f-11e2-88f4-533a93fe9432, value=32373938313038, timestamp=1368609315699785)
=> (column=c5bfe060-bf8e-11e2-ab1f-07be407aff58, value=32333634353034, timestamp=1368863069848610)
=> (column=f07ae4b0-c42f-11e2-8064-9794e872eb2b, value=363838353936, timestamp=1369372095163129)
Returned 7 results.
Elapsed time: 7.49 msec(s).

[default@prod] get contacts_timeordered['1425185-IGNORED'];
=> (column=1a93d5b0-97ec-11e2-888c-2bf068e0f754, value=31373930303330, timestamp=1364505108851088)
=> (column=b5c559c0-9d0f-11e2-8682-f7ecd4112689, value=32343130303930, timestamp=1365070157421869)
=> (column=7ba22b90-b48b-11e2-a2c2-914573921d9f, value=32353031353039, timestamp=1367652194221857)
=> (column=63ef5d80-b7e8-11e2-abf8-593c289227cd, value=32383435323830, timestamp=1368021951146575)
=> (column=d6383fc0-b810-11e2-a880-bd2ecacbaee3, value=31363334363737, timestamp=1368039322753824)
=> (column=f47d8e60-bd3f-11e2-88f4-533a93fe9432, value=32373938313038, timestamp=1368609315699785)
=> (column=c5bfe060-bf8e-11e2-ab1f-07be407aff58, value=32333634353034, timestamp=1368863069848610)
=> (column=f07ae4b0-c42f-11e2-8064-9794e872eb2b, value=363838353936, timestamp=1369372095163129)
Returned 8 results.
Elapsed time: 9.37 msec(s).

Do I have to change read_repair_chance to 1 to correct the inconsistency? nodetool repair doesn't solve it.

Thanks a lot,

2013/5/23 aaron morton

> If you are reading and writing at CL QUORUM and getting inconsistent
> results that sounds like a bug. If you are mixing the CL levels such that
> R + W <= N then it's expected behaviour.
>
> Can you reproduce the issue outside of your app?
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 21/05/2013, at 8:55 PM, Kais Ahmed wrote:
>
> > Checking you do not mean the row key is corrupt and cannot be read.
> Yes, I can read it, but reads do not all return the same result, except
> at CL ALL.
>
> > By default in 1.X and beyond the default read repair chance is 0.1, so
> > it's only enabled on 10% of requests.
> You are right, read repair chance is set to 0.1, but I launched a read
> repair, which did not solve the problem. Any idea?
>
> > What CL are you writing at?
> All writes are at CL QUORUM.
>
> Thank you, Aaron, for your answer.
>
> 2013/5/21 aaron morton
>
>> Only some keys of one CF are corrupt.
>>
>> Checking you do not mean the row key is corrupt and cannot be read.
>>
>> I thought using CL ALL would correct the problem with READ REPAIR, but
>> on returning to CL QUORUM, the problem persists.
>>
>> By default in 1.X and beyond the default read repair chance is 0.1, so
>> it's only enabled on 10% of requests.
>>
>> In the absence of further writes all reads (at any CL) should return the
>> same value.
>>
>> What CL are you writing at?
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Consultant
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 19/05/2013, at 1:28 AM, Kais Ahmed wrote:
>>
>> Hi all,
>>
>> I have encountered a consistency problem on some keys, using phpcassa
>> and Cassandra 1.2.3, since a server crash.
>>
>> Only some keys of one CF are corrupt.
>>
>> I launched a nodetool repair that completed successfully but did not
>> correct the issue.
>>
>> When I try to get a corrupt key with:
>>
>> CL ONE, the result contains 7, 8, or 9 columns
>>
>> CL QUORUM, the result contains 8 or 9 columns
>>
>> CL ALL, the data is consistent and always returns 9 columns
>>
>> I thought using CL ALL would correct the problem with READ REPAIR, but
>> on returning to CL QUORUM, the problem persists.
>>
>> Thank you for your help
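
To make the R + W <= N remark in this thread concrete, here is a small sketch in Python (not Cassandra code) that checks by brute force whether every possible set of R read replicas shares at least one node with every possible set of W write replicas. The helper names `quorum` and `overlap_guaranteed` are made up for illustration, and the model assumes each write really reached W replicas.

```python
from itertools import combinations


def quorum(n):
    """Size of a majority of n replicas (2 when n == 3)."""
    return n // 2 + 1


def overlap_guaranteed(r, w, n):
    """Return True when every read set of r replicas intersects every
    write set of w replicas, out of n replicas in total."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


if __name__ == "__main__":
    n = 3  # replication factor of the keyspace in this thread
    cases = [
        ("read ONE, write QUORUM", 1, quorum(n)),
        ("read QUORUM, write QUORUM", quorum(n), quorum(n)),
        ("read ALL, write QUORUM", n, quorum(n)),
    ]
    for label, r, w in cases:
        print(label, "-> overlap guaranteed:", overlap_guaranteed(r, w, n))
```

With RF 3, reading at ONE after writing at QUORUM gives R + W = 3, which is not greater than N, so a read may land on the one replica that missed the write. Reading and writing at QUORUM gives R + W = 4, so differing results there are not explained by this rule alone.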
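
For anyone comparing the three `get` outputs in this message by hand, a short Python sketch can list which columns each read is missing and decode the hex values. The column names and values are copied from the output in the message. Treating the values as ASCII text is an assumption, and the helper names `missing_columns` and `decode_ascii` are only illustrative.

```python
# Column names from the 9-column result, in the order cassandra-cli printed them.
FULL = [
    "1a927740-97ec-11e2-ab16-a1afd66a735e",
    "1a93d5b0-97ec-11e2-888c-2bf068e0f754",
    "b5c559c0-9d0f-11e2-8682-f7ecd4112689",
    "7ba22b90-b48b-11e2-a2c2-914573921d9f",
    "63ef5d80-b7e8-11e2-abf8-593c289227cd",
    "d6383fc0-b810-11e2-a880-bd2ecacbaee3",
    "f47d8e60-bd3f-11e2-88f4-533a93fe9432",
    "c5bfe060-bf8e-11e2-ab1f-07be407aff58",
    "f07ae4b0-c42f-11e2-8064-9794e872eb2b",
]

# The second read lacked the first two columns, the third read lacked the first.
READS = {
    "read 1 (9 results)": FULL,
    "read 2 (7 results)": FULL[2:],
    "read 3 (8 results)": FULL[1:],
}


def missing_columns(reads):
    """Map each read to the columns seen in some other read but not in it."""
    union = set().union(*reads.values())
    return {name: sorted(union - set(cols)) for name, cols in reads.items()}


def decode_ascii(hex_value):
    """Decode a hex string as ASCII (assumes the stored bytes are ASCII)."""
    return bytes.fromhex(hex_value).decode("ascii")


if __name__ == "__main__":
    for name, missing in missing_columns(READS).items():
        print(name, "is missing", missing)
    print(decode_ascii("363838353936"))  # value of the first and last columns
```

Both columns that go missing are the two with the oldest timestamps in the output (1364505108842098 and 1364505108851088), written about 9 ms apart.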