Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: domain of arodrime@gmail.com designates
 209.85.215.172 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CABn9xAHVbkirur9+fi5s5dkK7h6_wwO=aS1jNcn6+Ed8KL2vSw@mail.gmail.com>
References: 
 <CA+VSrLr3N4-=UGpwU95KE5D90qQnq_T410dfzP1ztwD1iUhrZA@mail.gmail.com>
 <CABn9xAGftJZsQ6AfqJ=7oaj-ZL51p8C_zorHfGtcNrBJm7mWfw@mail.gmail.com>
 <CA+VSrLp97k_L3umZvWJ5bheMqMSxxL4wiHXdHrB+yXMbNyp0qw@mail.gmail.com>
 <CABn9xAF=6j=348jfEufqGuEKJuEDysQn4dVupVJBF7orBH-HjA@mail.gmail.com>
 <CA+VSrLo7Bem4nFhJsMns4auJp5LBuewbzmb1CBm4VAsUkiS5OA@mail.gmail.com>
 <CABn9xAHVbkirur9+fi5s5dkK7h6_wwO=aS1jNcn6+Ed8KL2vSw@mail.gmail.com>
From: Alain RODRIGUEZ <arodrime@gmail.com>
Date: Tue, 8 Nov 2011 11:28:44 +0100
Message-ID: 
 <CA+VSrLqCF_rY63KAX3_nsG-NDQrWcv3rJpGtNO6vL56s0PiTYw@mail.gmail.com>
Subject: Re: Counters and replication factor
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=0015174c45ee09c13004b136a4a8

--0015174c45ee09c13004b136a4a8
Content-Type: text/plain; charset=ISO-8859-1

Sylvain, here is my ticket, though I guess you already know about it since
you are the assignee :) --> https://issues.apache.org/jira/browse/CASSANDRA-3465
Riyad, thanks for your help.

Alain

2011/11/7 Riyad Kalla <rkalla@gmail.com>

> Alain, thank you for all the clarification. I understand exactly what you
> meant now... and as a result I am just as confused as you are :)
>
> What version of Cassandra are you using? Can you share the important parts
> of your config? (Have you double-checked that your replication factor is
> set to "3" on all 3?)
>
> Also out of curiosity, if you keep querying for up to 5 mins (say every 10
> seconds) do counter1, 2 and 3 still show the same wrong values for getValue
> or do the values eventually converge on the correct amounts?
>
> (I assume 5 minutes is a long enough window to test; maybe I'm wrong and
> another Cassandra dev can correct me here.)
>
> -R
>
>
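
Riyad's check above (re-read counter1, counter2 and counter3 every 10
seconds for up to 5 minutes) could be sketched roughly as follows. This is
only a sketch in Python: `read_counter` is a hypothetical callable standing
in for whatever client call reads one counter value, and `poll_counters` and
`converged` are illustrative names, not part of any Cassandra client.

```python
import time


def poll_counters(read_counter, names, expected, attempts=30,
                  interval_s=10.0, sleep=time.sleep):
    """Re-read every counter up to `attempts` times.

    `read_counter(name)` is a hypothetical function returning the current
    value of one counter. Returns the list of snapshots taken, one dict
    per attempt, and stops early once every counter equals `expected`.
    """
    history = []
    for attempt in range(attempts):
        snapshot = {name: read_counter(name) for name in names}
        history.append(snapshot)
        if all(value == expected for value in snapshot.values()):
            break  # all counters agree with the expected count
        if attempt < attempts - 1:
            sleep(interval_s)  # wait before the next round of reads
    return history


def converged(history, expected):
    """True if the last snapshot shows `expected` for every counter."""
    return bool(history) and all(
        value == expected for value in history[-1].values()
    )
```

With the defaults this reads each counter 30 times, 10 seconds apart, which
covers roughly the 5-minute window. Keeping the whole history rather than
only the last read makes it possible to see whether the values were stable
but wrong, or changing from one read to the next.
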
> On Mon, Nov 7, 2011 at 9:57 AM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:
>
>> I retried it after restarting all the servers.
>>
>> I still have wrong results (I simulated an event 5 times and it was
>> counted 3 times by some counters, 4 or 5 times by others).
>>
>> What I meant by "but now every request returns me always the same count
>> value..." will be easier to explain with an example :
>>
>> event 1:
>>
>> counter1.increment
>> counter2.increment
>> counter3.increment
>>
>> .
>> .
>> .
>>
>> event 5:
>>
>> counter1.increment
>> counter2.increment
>> counter3.increment
>>
>> Show results :
>>
>> counter1.getValue = returns 4
>> counter2.getValue = returns 3
>> counter3.getValue = returns 5
>>
>> counter1.getValue = returns 5
>> counter2.getValue = returns 3
>> counter3.getValue = returns 5
>>
>> counter1.getValue = returns 4
>> counter2.getValue = returns 4
>> counter3.getValue = returns 5
>>
>> ...
>>
>> So I've got wrong values, and not always the same ones. In my previous
>> email, by saying "but now every request returns me always the same count
>> value...", I meant that I was getting the same wrong values every time,
>> for example:
>>
>> counter1.getValue = returns 4
>> counter2.getValue = returns 3
>> counter3.getValue = returns 5
>>
>> counter1.getValue = returns 4
>> counter2.getValue = returns 3
>> counter3.getValue = returns 5
>>
>> counter1.getValue = returns 4
>> counter2.getValue = returns 3
>> counter3.getValue = returns 5
>>
>> But that is not true: I still get some "random" wrong values. Maybe I
>> didn't query the counter values often enough to see it last time.
>>
>> Sorry for not being clearer; this is not easy to explain, nor for me to
>> understand.
>>
>> Thanks for your help.
>>
>> Alain
>>
>>
>> 2011/11/7 Riyad Kalla <rkalla@gmail.com>
>>
>>> Alain,
>>>
>>> When you tried CL.All, was that only after you had changed
>>> ReplicationFactor to 3 and restarted all the servers?
>>>
>>> If you hadn't restarted the servers with the new RF, I am not sure that
>>> CL.All would have the intended effect.
>>>
>>> Also, I wasn't sure what you meant by "but now every request returns me
>>> always the same count value..." -- didn't you want the requests to always
>>> return the same values?
>>>
>>> Or maybe you are saying that it always returns the same *wrong* value?
>>> Like you do:
>>>
>>> counter.increment (v=1)
>>> counter.increment (v=2)
>>> counter.increment (v=3)
>>>
>>> counter.getValue = returns 7
>>> counter.getValue = returns 7
>>> counter.getValue = returns 7
>>>
>>> or something inconsistent like that?
>>>
>>> On Mon, Nov 7, 2011 at 9:09 AM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:
>>>
>>>> I've tried with CL.All, but it doesn't work any better. I still have
>>>> strange values (between 4 and 10 events counted instead of 10) but now
>>>> every request returns me always the same count value...
>>>>
>>>> It's very strange.
>>>>
>>>> Any other ideas?
>>>>
>>>> Alain
>>>>
>>>>
>>>> 2011/11/7 Riyad Kalla <rkalla@gmail.com>
>>>>
>>>>> Alain,
>>>>>
>>>>> Try using a CL of 3 or "ALL" and see if the problem goes away.
>>>>>
>>>>> Your replication factor (as I just learned) dictates how many nodes
>>>>> each piece of data is replicated to; by using an RF of 3 on a 3-node
>>>>> ring you are saying "replicate all my data to all my nodes" (in this
>>>>> case counters).
>>>>>
>>>>> This doesn't happen immediately, but you can *force* it to happen on
>>>>> write by specifying a CL of "ALL". If you specify "1" then your counter
>>>>> value is written to one member of the ring, then your command returns.
>>>>>
>>>>> If you keep querying you will bounce around your ring, reading the
>>>>> values from the different nodes until a future date at *which point* all
>>>>> the values will likely agree.
>>>>>
>>>>> Keep all the code you have now exactly the same, and just change the
>>>>> part at the end where you read the counter value back: keep re-reading
>>>>> it every second for 60 seconds and see if all the values eventually
>>>>> match up -- they should (as the counter value is replicated to all the
>>>>> nodes and their old values discarded).
>>>>>
>>>>> -R
>>>>>
>>>>>
>>>>> On Mon, Nov 7, 2011 at 8:15 AM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to switch from RF = 1 to RF = 3, but I get wrong values
>>>>>> from counters when doing so...
>>>>>>
>>>>>> I have a CF that contains many counters for some events. When I'm at
>>>>>> RF = 1 and simulate 10 events, they are counted correctly.
>>>>>> However, when I switch to RF = 3, my counters show a wrong value that
>>>>>> sometimes changes between two requests (it can return 7, then 5,
>>>>>> instead of 10 every time).
>>>>>>
>>>>>> I first thought that it was a problem of CL because I seem to
>>>>>> remember that I read once that I had to use CL.One for reads and writes
>>>>>> with counters. So I tried with CL.One, without success...
>>>>>>
>>>>>> What am I doing wrong? Is there some precaution to take when
>>>>>> replicating counters?
>>>>>>
>>>>>> Alain
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
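
To make the explanation quoted earlier in this thread concrete (a write at
CL ONE lands on one replica first, reads at CL ONE can hit any replica, and
the values agree once replication has caught up), here is a toy model in
Python. It is an illustration under strong assumptions, not how Cassandra
implements counters: `ToyReplicatedCounter` and its methods are hypothetical
names, replication is an explicit `replicate()` call, and the model does not
reproduce the behaviour reported in CASSANDRA-3465.

```python
import random


class ToyReplicatedCounter:
    """Toy model: each replica holds its own view of a single counter."""

    def __init__(self, replicas=3, seed=0):
        self.values = [0] * replicas   # current value seen by each replica
        self.pending = []              # (replica index, delta) not yet applied
        self.rng = random.Random(seed)

    def increment(self, delta=1, write_all=False):
        """Apply the write to one replica now, or to all if write_all."""
        first = self.rng.randrange(len(self.values))
        for i in range(len(self.values)):
            if write_all or i == first:
                self.values[i] += delta
            else:
                self.pending.append((i, delta))  # delivered later

    def read_one(self):
        """Read from a single, arbitrarily chosen replica."""
        return self.rng.choice(self.values)

    def replicate(self):
        """Deliver every pending write so that all replicas agree."""
        for i, delta in self.pending:
            self.values[i] += delta
        self.pending.clear()


if __name__ == "__main__":
    counter = ToyReplicatedCounter()
    for _ in range(5):
        counter.increment()
    print("before replication:", [counter.read_one() for _ in range(6)])
    counter.replicate()
    print("after replication: ", [counter.read_one() for _ in range(6)])
```

In this model, reads taken before `replicate()` can differ from one call to
the next, while reads taken afterwards all return the full count; passing
`write_all=True` plays the role of writing at CL ALL.
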

--0015174c45ee09c13004b136a4a8--