From: aaron morton
To: user@cassandra.apache.org
Subject: Re: Tuning cassandra (compactions overall)
Date: Wed, 23 May 2012 21:50:07 +1200

I've not heard of anything like that in the recent versions. There were
some issues in the early 0.8 releases:
https://github.com/apache/cassandra/blob/trunk/NEWS.txt#L383

If you are on a recent version, can you please create a jira ticket at
https://issues.apache.org/jira/browse/CASSANDRA describing what you
think happened.

If you have kept the logs from the startup and can make them available,
please do.

Thanks

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 23/05/2012, at 12:42 AM, Alain RODRIGUEZ wrote:

> "not sure what you mean by
> And after restarting the second one I have lost all the consistency of
> my data. All my statistics since September are totally false now in
> production
>
> Can you give some examples?"
>
> After restarting my 2 nodes (one after the other), all my counters
> have become wrong. The counter values were modified by the restart.
> Let's say I had a counter column called 20120101#click whose value was
> 569; after the restart the value has become 751. I think that all the
> values have increased (I'm not sure), but the counters have increased
> in different ways: some values have increased a lot, others just a bit.
>
> "Counter are not idempotent so if the client app retries TimedOut
> requests you can get an over count. That should not result in lost
> data."
>
> Some of these counters haven't been written since September and have
> still been modified by the restart.
>
> "Have you been running repair ?"
>
> Yes, repair didn't help. I have the feeling that repairing doesn't
> work on counters.
>
> I have restored the data now, but I am afraid of restarting any node.
> I can't remain in this position too long...
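
The remark quoted in this thread, that counters are not idempotent and that retrying TimedOut requests can over count, can be illustrated with a toy model. This is only a sketch: `FakeCounterNode`, `TimedOut` and `increment_with_retry` are hypothetical names invented for illustration, not part of Cassandra or any client library, and the model says nothing about the restart behaviour reported above. It assumes the write is applied on the node but the acknowledgement is lost, so the client times out and sends the same increment again.

```python
class TimedOut(Exception):
    """Raised when the client gives up waiting for a reply (hypothetical)."""


class FakeCounterNode:
    """Hypothetical in-memory stand-in for a node holding one counter."""

    def __init__(self, lost_replies):
        self.value = 0
        self.calls = 0
        # 1-based indexes of calls whose reply never reaches the client.
        self.lost_replies = set(lost_replies)

    def increment(self, delta):
        self.calls += 1
        self.value += delta  # the write is applied first...
        if self.calls in self.lost_replies:
            raise TimedOut()  # ...but the acknowledgement is lost


def increment_with_retry(node, delta, max_attempts=3):
    """Naive client: resend the same increment whenever it times out."""
    for _ in range(max_attempts):
        try:
            node.increment(delta)
            return
        except TimedOut:
            continue
    raise TimedOut()


# One logical click; the first reply is lost, so the client retries.
node = FakeCounterNode(lost_replies=[1])
increment_with_retry(node, 1)
print(node.value)  # 2: the single click was counted twice
```

In this model a retry can only push the counter up, never down, which matches the "over count" wording in the quoted remark. It would not account for counters changing when no client has written to them.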