From: Alain RODRIGUEZ
Date: Thu, 6 Sep 2012 12:51:35 +0200
Subject: Re: Invalid Counter Shard errors?
To: user@cassandra.apache.org

Hi, nobody knows about this?

Alain

2012/9/3 Alain RODRIGUEZ <arodrime@gmail.com>
Hello,

I'm running a two-node Cassandra 1.1.2 cluster with RF=2 (CL = 1,
nodes are m1.large from Amazon).

I had this error 524 times last month on node 1 and 2805 times on
the second node.

Should I worry about it? How can I fix these errors?

Alain

2012/6/2 Peter Schuller <peter.schuller@infidyne.com>:
>> We're running a three node cluster of cassandra 1.1 servers, originally
>> 1.0.7 and immediately after the upgrade the error logs of all three servers
>> began filling up with the following message:
>
> The message you are receiving is new, but the problem it identifies is
> not. The checking for this condition, and the logging, was added so
> that certain kinds of counter corruption would be self-healed
> eventually instead of remaining forever incorrect. Likely nothing is
> wrong that wasn't before; you're just seeing it being logged now.
>
> And I can confirm having seen this on 1.1, so the root cause remains
> unknown as far as I can tell (had previously hoped the root cause were
> thread-unsafe shard merging, or one of the other counter related
> issues fixed during the 0.8 run).
>
> --
> / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
