Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: local policy)
From: aaron morton <aaron@thelastpickle.com>
Content-Type: multipart/alternative;
 boundary="Apple-Mail=_27C4FA32-2CDD-4698-A85A-221410A2EDB2"
Message-Id: <D8F0C930-751C-457B-9075-979732852D1B@thelastpickle.com>
Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\))
Subject: Re: Multiple counters value after restart
Date: Thu, 1 Nov 2012 21:41:01 +1300
References: 
 <CA+VSrLrHW7_Pg9_mu4Q2JXwhGfa-+GFAtfm__OoUR8azx-OqmQ@mail.gmail.com>
 <CAHO4ity7grEMDgWp83UfEDBxMGyPdEZmu3LGQ3ksGE8VrBWTVQ@mail.gmail.com>
 <7AF531E1-50B2-403D-848F-BE4241411E45@thelastpickle.com>
 <CA+VSrLrfRO-nu77Fy2tCM7R9LxkRWannUy4egw2ovAmcniCcoA@mail.gmail.com>
To: user@cassandra.apache.org
In-Reply-To: 
 <CA+VSrLrfRO-nu77Fy2tCM7R9LxkRWannUy4egw2ovAmcniCcoA@mail.gmail.com>


--Apple-Mail=_27C4FA32-2CDD-4698-A85A-221410A2EDB2
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=iso-8859-1

> "What CL are you using ?"
>=20
> I think this may be what causes the issue. I'm writing and reading at =
CL ONE. I didn't drain before stopping Cassandra, and this may have =
caused a failure in the current counters (those that were being written =
when I stopped the server).
My first thought is to use QUORUM. But with only two nodes it's hard to =
get strong consistency using QUORUM.
Can you try it though, or run a repair?
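
To make the arithmetic behind that concrete, here is a rough sketch.
The function names are made up for illustration and are not part of
any Cassandra API. It only assumes the usual rules for regular reads
and writes: a quorum is floor(RF / 2) + 1 replicas, and a read is
guaranteed to see the latest acknowledged write when R + W > RF.

```python
def quorum(replication_factor):
    # Replicas that must respond at CL QUORUM: floor(RF / 2) + 1.
    return replication_factor // 2 + 1


def overlaps(read_replicas, write_replicas, replication_factor):
    # A read is guaranteed to reach a replica holding the latest
    # acknowledged write only when R + W > RF.
    return read_replicas + write_replicas > replication_factor


def nodes_that_can_be_down(replication_factor):
    # Replicas that may be unavailable while QUORUM still succeeds.
    return replication_factor - quorum(replication_factor)


# The cluster in this thread: 2 nodes, RF 2.
print("CL.ONE reads and writes overlap:", overlaps(1, 1, 2))
print("QUORUM reads and writes overlap:",
      overlaps(quorum(2), quorum(2), 2))
print("replicas needed for QUORUM:", quorum(2))
print("nodes that can be down at QUORUM:", nodes_that_can_be_down(2))
```

With RF 2 this reports that reads and writes at ONE do not overlap,
that QUORUM needs both replicas, and that no node can be down while
QUORUM requests still succeed.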

> But isn't Cassandra supposed to handle a server crash? When a server =
crashes, I guess it doesn't drain first...

I was asking to understand how you did the upgrade.=20

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 1/11/2012, at 11:39 AM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:

> "What version of cassandra are you using ?"
>=20
> 1.1.2
>=20
> "Can you explain this further?"
>=20
> I had an unexplained number of reads (up to 1800 r/s and 90 MB/s) on =
one server, while the other was doing about 200 r/s and 5 MB/s max. I =
fixed it by rebooting the server. This server is dedicated to =
Cassandra. I can't tell you more about it because I don't understand =
it... But a simple Cassandra restart wasn't enough.
>=20
> "Was something writing to the cluster ?"
>=20
> Yes, we have some activity and perform about 600 w/s.
>=20
> "Did you drain for the upgrade ?"
>=20
> We upgraded a long time ago, to 1.1.2. This warning is about =
version 1.1.6.
>=20
> "What changes did you make ?"
>=20
> In cassandra.yaml I just changed the =
"compaction_throughput_mb_per_sec" property to slow down my compaction a =
bit. I don't think the problem comes from there.
>=20
> "Are you saying that a particular counter column is giving different =
values for different reads ?"
>=20
> Yes, this is exactly what I was saying. Sorry if something is wrong =
with my English, it's not my mother tongue.
>=20
> "What CL are you using ?"
>=20
> I think this may be what causes the issue. I'm writing and reading at =
CL ONE. I didn't drain before stopping Cassandra, and this may have =
caused a failure in the current counters (those that were being written =
when I stopped the server).
>=20
> But isn't Cassandra supposed to handle a server crash? When a server =
crashes, I guess it doesn't drain first...
>=20
> Thank you for your time Aaron, once again.
>=20
> Alain
>=20
>=20
>=20
> 2012/10/31 aaron morton <aaron@thelastpickle.com>
> What version of cassandra are you using ?
>=20
>> I finally restarted Cassandra. It didn't solve the problem, so I =
stopped Cassandra again on that node and restarted my EC2 server. This =
solved the issue (1800 r/s to 100 r/s).
> Can you explain this further?
> Was something writing to the cluster ?
> Did you drain for the upgrade ? =
https://github.com/apache/cassandra/blob/cassandra-1.1/NEWS.txt#L17
>=20
>> Today I changed my cassandra.yaml and restarted this same server to =
apply my conf.
>=20
> What changes did you make ?
>=20
>> I just noticed that my homepage (which uses a Cassandra counter and =
refreshes every sec) shows me 4 different values: 2 of them repeatedly =
(5000 and 4000) and the other 2 only rarely (5500 and 3800).
> Are you saying that a particular counter column is giving different =
values for different reads ?=20
> What CL are you using ?
>=20
> Cheers
>=20
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>=20
> On 31/10/2012, at 3:39 AM, Jason Wee <peichieh@gmail.com> wrote:
>=20
>> Maybe enable debug logging in log4j-server.properties and go through =
the log to see what actually happens?
>>=20
>> On Tue, Oct 30, 2012 at 7:31 PM, Alain RODRIGUEZ <arodrime@gmail.com> =
wrote:
>> Hi,=20
>>=20
>> I have an issue with counters. Yesterday I had a lot of =
unexplained reads/sec on one server. I finally restarted Cassandra. =
It didn't solve the problem, so I stopped Cassandra again on that node =
and restarted my EC2 server. This solved the issue (1800 r/s to 100 r/s).
>>=20
>> Today I changed my cassandra.yaml and restarted this same server to =
apply my conf.
>>=20
>> I just noticed that my homepage (which uses a Cassandra counter and =
refreshes every sec) shows me 4 different values: 2 of them repeatedly =
(5000 and 4000) and the other 2 only rarely (5500 and 3800).
>>=20
>> Only the counters created today and yesterday are affected.
>>=20
>> I performed a repair without success. This data is the heart of our =
business, so if anyone has any clue about it, I would be really grateful...
>>=20
>> The sooner the better, I am in production with these random counters.
>>=20
>> Alain
>>=20
>> INFO:
>>=20
>> My environment is 2 nodes (EC2 large), RF 2, CL.ONE (R & W), Random =
Partitioner.
>>=20
>> xxx.xxx.xxx.241    eu-west     1b          Up     Normal  151.95 GB   =
    50.00%              0
>> xxx.xxx.xxx.109    eu-west     1b          Up     Normal  117.71 GB   =
    50.00%              85070591730234615865843651857942052864
>>=20
>> Here is my conf: http://pastebin.com/5cMuBKDt
>>=20
>>=20
>>=20
>=20
>=20


--Apple-Mail=_27C4FA32-2CDD-4698-A85A-221410A2EDB2--