Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Thu, 10 Jan 2013 07:50:13 +0000 (UTC)
From: "Janne Jalkanen (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.12597700.1341558050847.111959.1357804213989@arcas>
In-Reply-To: <JIRA.12597700.1341558050847@arcas>
References: <JIRA.12597700.1341558050847@arcas>
Subject: [jira] [Comment Edited] (CASSANDRA-4417) invalid counter shard
 detected
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/CASSANDRA-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549412#comment-13549412 ] 

Janne Jalkanen edited comment on CASSANDRA-4417 at 1/10/13 7:48 AM:
--------------------------------------------------------------------

I'm seeing this while running repair -pr. Three-node cluster, RF 3. Straight upgrade from 1.0.12 to 1.1.8; no topology changes.  I see two invalid shard IDs, counts differ by more than one - sometimes even by 3000 or more.  Seems random to my eyes.

Our counters are in a composite column family, no TTLs in use.  We *mostly* increment by one, but sometimes more.

I did disablegossip, disablethrift, drain, shutdown, upgrade, restart on every node in a rolling fashion.  Once the entire cluster had been upgraded, I ran upgradesstables and repair -pr on every node.
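
For readers trying to reproduce the sequence, here is a rough sketch that only prints the commands in the order described above. The host names are hypothetical, and the shutdown/upgrade/restart step is left as a comment because it depends on how Cassandra is installed on each machine; treat it as an illustration of the ordering, not as the exact commands that were run.

```python
from typing import List

# Hypothetical host names; the comment only says the cluster has three nodes.
NODES = ["node1", "node2", "node3"]


def rolling_upgrade_steps(host: str) -> List[str]:
    """Per-node steps, in the order given in the comment."""
    nt = f"nodetool -h {host}"
    return [
        f"{nt} disablegossip",
        f"{nt} disablethrift",
        f"{nt} drain",
        # Site-specific and therefore not spelled out here.
        f"# {host}: shut down, upgrade 1.0.12 -> 1.1.8, restart",
    ]


def post_upgrade_steps(host: str) -> List[str]:
    """Steps run on each node only after the whole cluster is upgraded."""
    nt = f"nodetool -h {host}"
    return [f"{nt} upgradesstables", f"{nt} repair -pr"]


def full_plan(nodes: List[str]) -> List[str]:
    plan: List[str] = []
    for host in nodes:          # first pass: rolling upgrade, one node at a time
        plan.extend(rolling_upgrade_steps(host))
    for host in nodes:          # second pass: only after every node is upgraded
        plan.extend(post_upgrade_steps(host))
    return plan


if __name__ == "__main__":
    print("\n".join(full_plan(NODES)))
```

The two separate loops reflect that upgradesstables and repair -pr were started only after the last node had been upgraded.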
                
      was (Author: jalkanen):
    I'm seeing this while running repair -pr. Three-node cluster, RF 3. Straight upgrade from 1.0.12 to 1.1.8; no topology changes.  I see two invalid shard IDs, counts differ by more than one - sometimes even by 3000 or more.  Seems random to my eyes.

Our counters are in a composite column family, no TTLs in use.  We *mostly* increment by one, but sometimes more.

I did disablegossip, disablethrift, drain, upgrade, restart on every node in a rolling fashion.  Then I did upgradesstables and repair -pr on every node when the entire cluster had been upgraded.
                  
> invalid counter shard detected 
> -------------------------------
>
>                 Key: CASSANDRA-4417
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4417
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1.1
>         Environment: Amazon Linux
>            Reporter: Senthilvel Rangaswamy
>         Attachments: cassandra-mck.log.bz2, err.txt
>
>
> Seeing errors like these:
> 2012-07-06_07:00:27.22662 ERROR 07:00:27,226 invalid counter shard detected; (17bfd850-ac52-11e1-0000-6ecd0b5b61e7, 1, 13) and (17bfd850-ac52-11e1-0000-6ecd0b5b61e7, 1, 1) differ only in count; will pick highest to self-heal; this indicates a bug or corruption generated a bad counter shard
> What does it mean?
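
As a rough illustration of what the quoted log line describes, the sketch below models a shard as a (counter id, clock, count) tuple and picks the highest count when two shards agree on id and clock but not on count. The field order is an assumption read off the message ("differ only in count", with 13 vs 1 in the third position), the names Shard and reconcile are hypothetical, and this is not Cassandra's actual code.

```python
from typing import NamedTuple


class Shard(NamedTuple):
    # Assumed field order, inferred from the log message.
    counter_id: str
    clock: int
    count: int


def reconcile(a: Shard, b: Shard) -> Shard:
    """Pick one of two shards that share a counter id."""
    if a.counter_id != b.counter_id:
        raise ValueError("shards belong to different counter ids")
    if a.clock != b.clock:
        # Normal case in this sketch: the shard with the higher clock wins.
        return a if a.clock > b.clock else b
    if a.count != b.count:
        # Same id and clock but different counts should not happen;
        # report it and keep the highest count, as the message says.
        print(
            f"invalid counter shard detected; {tuple(a)} and {tuple(b)} "
            "differ only in count; will pick highest to self-heal"
        )
        return a if a.count > b.count else b
    return a  # identical shards


if __name__ == "__main__":
    cid = "17bfd850-ac52-11e1-0000-6ecd0b5b61e7"
    print(reconcile(Shard(cid, 1, 13), Shard(cid, 1, 1)))
```

With the two shards from the log line, this keeps the one whose count is 13.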
