From user-return-22090-apmail-cassandra-user-archive=cassandra.apache.org@cassandra.apache.org Mon Nov 7 16:58:16 2011
Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Reply-To: user@cassandra.apache.org
MIME-Version: 1.0
From: Alain RODRIGUEZ <arodrime@gmail.com>
Date: Mon, 7 Nov 2011 17:57:25 +0100
Subject: Re: Counters and replication factor
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=0015174c0e403eefce04b127f467

--0015174c0e403eefce04b127f467
Content-Type: text/plain; charset=ISO-8859-1

I retried it after restarting all the servers.

I still have wrong results (I simulated an event 5 times and it was counted
3 times by some counters, 4 or 5 times by others).

What I meant by "but now every request returns me always the same count
value..." will be easier to explain with an example:

event 1:

counter1.increment
counter2.increment
counter3.increment

.
.
.

event 5:

counter1.increment
counter2.increment
counter3.increment

Show results:

counter1.getValue = returns 4
counter2.getValue = returns 3
counter3.getValue = returns 5

counter1.getValue = returns 5
counter2.getValue = returns 3
counter3.getValue = returns 5

counter1.getValue = returns 4
counter2.getValue = returns 4
counter3.getValue = returns 5

...

So I've got wrong values, and not always the same ones. In my previous
email, by saying "but now every request returns me always the same count
value...", I tried to tell you that I got the same wrong values every time,
let us say:

counter1.getValue = returns 4
counter2.getValue = returns 3
counter3.getValue = returns 5

counter1.getValue = returns 4
counter2.getValue = returns 3
counter3.getValue = returns 5

counter1.getValue = returns 4
counter2.getValue = returns 3
counter3.getValue = returns 5

But that is not true: I still get some "random" wrong values. Maybe I did
not query the counter values often enough to see it last time.

Sorry for not being clearer; this is not easy to explain, nor for me to
understand.

Thanks for the help.

Alain

2011/11/7 Riyad Kalla <rkalla@gmail.com>

> Alain,
>
> When you tried CL.All was that only after you had made the change of
> ReplicationFactor=3 and restarted all the servers?
>
> If you hadn't restarted the servers with the new RF, I am not sure that
> CL.All would have the intended effect.
>
> Also, I wasn't sure what you meant by "but now every request returns me
> always the same count value..." -- didn't you want the requests to always
> return you the same values?
>
> Or maybe you are saying that it always returns the same *wrong* value?
> Like you do:
>
> counter.increment (v=1)
> counter.increment (v=2)
> counter.increment (v=3)
>
> counter.getValue = returns 7
> counter.getValue = returns 7
> counter.getValue = returns 7
>
> or something inconsistent like that?
>
> On Mon, Nov 7, 2011 at 9:09 AM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:
>
>> I've tried with CL.All, but it doesn't work better. I still have strange
>> values (between 4 and 10 events counted instead of 10) but now every
>> request returns me always the same count value...
>>
>> It's very strange.
>>
>> Any other idea?
>>
>> Alain
>>
>>
>> 2011/11/7 Riyad Kalla <rkalla@gmail.com>
>>
>>> Alain,
>>>
>>> Try using a CL of 3 or "ALL" and see if the problem goes away.
>>>
>>> Your replication factor (as I just learned) dictates how many nodes each
>>> piece of data is replicated to; by using a RF of 3 you are saying
>>> "replicate all my data to all my nodes" (in this case counters).
>>>
>>> This doesn't happen immediately, but you can *force* it to happen on
>>> write by specifying a CL of "ALL". If you specify "1" then your counter
>>> value is written to one member of the ring, then your command returns.
>>>
>>> If you keep querying you will bounce around your ring, reading the
>>> values from the different nodes until a future date at *which point* all
>>> the values will likely agree.
>>>
>>> If you keep all your code you have now exactly the same, just change the
>>> code at the end where you read the counter value back, to keep reading the
>>> counter value back every second for 60 seconds and see if all the values
>>> eventually match up -- they should (as the counter value is replicated to
>>> all the nodes and their old values discarded).
>>>
>>> -R
>>>
>>>
>>> On Mon, Nov 7, 2011 at 8:15 AM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am trying to switch from RF = 1 to RF = 3, but I get wrong values
>>>> from counters when doing so...
>>>>
>>>> I have a CF that contains many counters of some events. When I'm at
>>>> RF = 1 and simulate 10 events, they are counted correctly.
>>>> However, when I switch to RF = 3, my counters show a wrong value that
>>>> sometimes changes when requested twice (it can return 7, then 5, instead
>>>> of 10 all the time).
>>>>
>>>> I first thought that it was a problem of CL because I seem to remember
>>>> that I read once that I had to use CL.One for reads and writes with
>>>> counters. So I tried with CL.One, without success...
>>>>
>>>> What am I doing wrong? Is there some precaution to take when
>>>> replicating counters?
>>>>
>>>> Alain
>>>>
>>>
>>>
>>
>

--0015174c0e403eefce04b127f467
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
--0015174c0e403eefce04b127f467--
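
Riyad's suggestion in the thread above (read the counter back once per second for 60 seconds and check whether the values settle) could be sketched roughly as follows. This is only an illustration in Python, not code from the thread: `poll_counter`, `has_converged`, and the `read_counter` callable are hypothetical names and are not part of Cassandra or any client library. `read_counter` stands in for whatever client call reads one counter at the consistency level under test.

```python
import time
from typing import Callable, List


def poll_counter(read_counter: Callable[[], int],
                 attempts: int = 60,
                 interval_s: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> List[int]:
    """Read the counter `attempts` times, pausing `interval_s` between reads.

    `read_counter` is a hypothetical zero-argument callable supplied by the
    caller; it should wrap the real client read of a single counter.
    """
    values: List[int] = []
    for i in range(attempts):
        values.append(read_counter())
        if i < attempts - 1:
            sleep(interval_s)  # no pause needed after the final read
    return values


def has_converged(values: List[int], expected: int, tail: int = 10) -> bool:
    """Return True if the last `tail` reads all equal `expected`."""
    if tail <= 0:
        raise ValueError("tail must be positive")
    recent = values[-tail:]
    return len(recent) == tail and all(v == expected for v in recent)


if __name__ == "__main__":
    # Stand-in data for demonstration only: a fake reader that flaps
    # between values before settling, similar to the reads described above.
    fake_reads = iter([4, 5, 4] + [5] * 7)
    observed = poll_counter(lambda: next(fake_reads), attempts=10,
                            interval_s=0.0, sleep=lambda s: None)
    print(observed)
    print("settled on 5:", has_converged(observed, expected=5, tail=5))
```

Recording every read, rather than only the last one, makes it possible to tell a value that flaps between replicas apart from one that is stable but wrong, which is the distinction the messages above were trying to pin down.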