From: aaron morton
Subject: Re: Multiple counters value after restart
Date: Thu, 1 Nov 2012 21:41:01 +1300
To: user@cassandra.apache.org

> "What CL are you using?"
>
> I think this may be what causes the issue. I'm writing and reading at CL ONE. I didn't drain before stopping Cassandra, and this may have produced a failure in the current counters (those which were being written when I stopped a server).

My first thought is to use QUORUM. But with only two nodes it's hard to get strong consistency using QUORUM.
Can you try it though, or run a repair?
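To make the two-node concern concrete, here is a minimal sketch of the usual replica-count arithmetic. It assumes the standard definitions (QUORUM = floor(RF / 2) + 1, and a read is guaranteed to touch a replica that acknowledged the write when R + W > RF). The function names are hypothetical helpers for illustration, not part of any Cassandra client API, and the sketch says nothing about counter internals.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond at QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(consistency_level: str, replication_factor: int) -> int:
    """Replicas that must acknowledge a request at the given level."""
    levels = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return levels[consistency_level]


def overlaps(read_cl: str, write_cl: str, replication_factor: int) -> bool:
    """True when every read must touch a replica that took the write (R + W > RF)."""
    r = replicas_required(read_cl, replication_factor)
    w = replicas_required(write_cl, replication_factor)
    return r + w > replication_factor


def tolerated_failures(consistency_level: str, replication_factor: int) -> int:
    """Replicas that can be down while requests at this level still succeed."""
    return replication_factor - replicas_required(consistency_level, replication_factor)


if __name__ == "__main__":
    rf = 2  # the cluster in this thread: 2 nodes, RF 2
    for cl in ("ONE", "QUORUM"):
        print(cl, replicas_required(cl, rf), overlaps(cl, cl, rf), tolerated_failures(cl, rf))
```

With RF 2, QUORUM needs both replicas: reads and writes overlap, but a single node being down fails the request. With RF 3, QUORUM would still need only 2 replicas and could tolerate one node being down.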

> But isn't Cassandra supposed to handle a server crash? When a server crashes, I guess it doesn't drain first...

I was asking to understand how you did the upgrade.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 1/11/2012, at 11:39 AM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:

> "What version of cassandra are you using?"
>
> 1.1.2
>
> "Can you explain this further?"
>
> I had an unexplained volume of reads (up to 1800 r/s and 90 MB/s) on one server, while the other was doing about 200 r/s and 5 MB/s at most. I fixed it by rebooting the server. This server is dedicated to Cassandra. I can't tell you more about it because I don't understand it... But a simple Cassandra restart wasn't enough.
>
> "Was something writing to the cluster?"
>
> Yes, we have some activity and perform about 600 w/s.
>
> "Did you drain for the upgrade?"
>
> We upgraded to 1.1.2 a long time ago. This warning is about version 1.1.6.
>
> "What changes did you make?"
>
> In cassandra.yaml I just changed the "compaction_throughput_mb_per_sec" property to slow down my compaction a bit. I don't think the problem comes from there.
>
> "Are you saying that a particular counter column is giving different values for different reads?"
>
> Yes, this is exactly what I was saying. Sorry if something is wrong with my English, it's not my mother tongue.
>
> "What CL are you using?"
>
> I think this may be what causes the issue. I'm writing and reading at CL ONE. I didn't drain before stopping Cassandra, and this may have produced a failure in the current counters (those which were being written when I stopped a server).
>
> But isn't Cassandra supposed to handle a server crash? When a server crashes, I guess it doesn't drain first...
>
> Thank you for your time Aaron, once again.
>
> Alain



> 2012/10/31 aaron morton <aaron@thelastpickle.com>
> What version of cassandra are you using?
>
>> I finally restarted Cassandra. It didn't solve the problem, so I stopped Cassandra again on that node and restarted my EC2 server. This solved the issue (1800 r/s to 100 r/s).
> Can you explain this further?
> Was something writing to the cluster?
> Did you drain for the upgrade? https://github.com/apache/cassandra/blob/cassandra-1.1/NEWS.txt#L17
>
>> Today I changed my cassandra.yaml and restarted this same server to apply my conf.
>
> What changes did you make?
>
>> I just noticed that my homepage (which uses a Cassandra counter and refreshes every second) shows me 4 different values: 2 of them repeatedly (5000 and 4000) and the other 2 only rarely (5500 and 3800).
> Are you saying that a particular counter column is giving different values for different reads?
> What CL are you using?
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com

> On 31/10/2012, at 3:39 AM, Jason Wee <peichieh@gmail.com> wrote:
>
>> Maybe enable debug logging in log4j-server.properties and go through the log to see what actually happens?
>>
>> On Tue, Oct 30, 2012 at 7:31 PM, Alain RODRIGUEZ <arodrime@gmail.com> wrote:
>> Hi,
>>
>> I have an issue with counters. Yesterday I had a lot of unexplained reads/sec on one server. I finally restarted Cassandra. It didn't solve the problem, so I stopped Cassandra again on that node and restarted my EC2 server. This solved the issue (1800 r/s to 100 r/s).
>>
>> Today I changed my cassandra.yaml and restarted this same server to apply my conf.
>>
>> I just noticed that my homepage (which uses a Cassandra counter and refreshes every second) shows me 4 different values: 2 of them repeatedly (5000 and 4000) and the other 2 only rarely (5500 and 3800).
>>
>> Only the counters created today and yesterday are affected.
>>
>> I performed a repair without success. This data is the heart of our business, so if anyone has any clue about it, I would be really grateful...
>>
>> The sooner the better, I am in production with these random counters.
>>
>> Alain
>>
>> INFO:
>>
>> My environment is 2 nodes (EC2 large), RF 2, CL.ONE (R & W), Random Partitioner.
>>
>> xxx.xxx.xxx.241    eu-west    1b    Up    Normal    151.95 GB    50.00%    0
>> xxx.xxx.xxx.109    eu-west    1b    Up    Normal    117.71 GB    50.00%    85070591730234615865843651857942052864
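The two tokens in the ring output above match an evenly split ring, which is consistent with the 50.00% shown for each node. Here is a minimal sketch of that arithmetic, assuming the RandomPartitioner token space of 0 to 2**127. The names `balanced_tokens` and `ownership` are hypothetical helpers for illustration, not Cassandra tools.

```python
from typing import Dict, List

RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space: 0 .. 2**127


def balanced_tokens(node_count: int) -> List[int]:
    """Evenly spaced tokens for node_count nodes."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * RING_SIZE // node_count for i in range(node_count)]


def ownership(tokens: List[int]) -> Dict[int, float]:
    """Fraction of the ring ending at each token, measured from the previous token."""
    ordered = sorted(tokens)
    shares = {}
    for idx, token in enumerate(ordered):
        previous = ordered[idx - 1]  # index -1 wraps around to the last token
        span = (token - previous) % RING_SIZE or RING_SIZE  # a single node owns everything
        shares[token] = span / RING_SIZE
    return shares


if __name__ == "__main__":
    tokens = balanced_tokens(2)
    for token, share in ownership(tokens).items():
        print(token, f"{share:.2%}")
```

For two nodes this yields tokens 0 and 85070591730234615865843651857942052864, each covering half of the ring, so the token assignment itself looks balanced.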




>> Here is my conf: http://pastebin.com/5cMuBKDt