cassandra-commits mailing list archives

From "Janne Jalkanen (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-4417) invalid counter shard detected
Date Thu, 10 Jan 2013 07:50:16 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549412#comment-13549412 ]

Janne Jalkanen edited comment on CASSANDRA-4417 at 1/10/13 7:49 AM:
--------------------------------------------------------------------

I'm seeing this while running repair -pr on a three-node cluster with RF 3. It was a straight
upgrade from 1.0.12 to 1.1.8, with no topology changes.  I see two invalid shard IDs, and the
counts differ by more than one, sometimes by 3000 or more.  The pattern seems random to me.

Our counters are in a composite column family, with no TTLs in use.  We *mostly* increment by
one, but sometimes by more.

On every node, in a rolling fashion, I ran disablegossip, disablethrift, and drain, then shut
down, upgraded, and restarted.  Once the entire cluster had been upgraded, I ran upgradesstables
and repair -pr on every node. The environment is Ubuntu Linux 12.04 LTS.
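
The sequence described above can be sketched as a small plan builder. This is only an illustration, not the commands actually run: the host names are hypothetical, the shutdown/upgrade/restart step is left as a placeholder because the comment does not say how it was done, and running upgradesstables followed by repair -pr node by node is an assumed ordering.

```python
from typing import List

# nodetool subcommands named in the comment, in the order given.
PRE_SHUTDOWN = ["disablegossip", "disablethrift", "drain"]
POST_UPGRADE = ["upgradesstables", "repair -pr"]


def upgrade_plan(hosts: List[str]) -> List[str]:
    """Return the ordered steps as strings; nothing is executed."""
    plan: List[str] = []
    # Phase 1: rolling upgrade, one node at a time.
    for host in hosts:
        for cmd in PRE_SHUTDOWN:
            plan.append(f"nodetool -h {host} {cmd}")
        # Placeholder: the mechanism for this step is not stated in the comment.
        plan.append(f"# on {host}: shut down, upgrade, restart Cassandra")
    # Phase 2: only after every node has been upgraded.
    for host in hosts:
        for cmd in POST_UPGRADE:
            plan.append(f"nodetool -h {host} {cmd}")
    return plan


if __name__ == "__main__":
    # Hypothetical host names for a three-node cluster.
    for step in upgrade_plan(["node1", "node2", "node3"]):
        print(step)
```

Keeping the two phases in separate loops makes the constraint explicit: no node starts upgradesstables or repair until the whole cluster is on the new version.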
                
> invalid counter shard detected 
> -------------------------------
>
>                 Key: CASSANDRA-4417
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4417
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1.1
>         Environment: Amazon Linux
>            Reporter: Senthilvel Rangaswamy
>         Attachments: cassandra-mck.log.bz2, err.txt
>
>
> Seeing errors like these:
> 2012-07-06_07:00:27.22662 ERROR 07:00:27,226 invalid counter shard detected; (17bfd850-ac52-11e1-0000-6ecd0b5b61e7, 1, 13) and (17bfd850-ac52-11e1-0000-6ecd0b5b61e7, 1, 1) differ only in count; will pick highest to self-heal; this indicates a bug or corruption generated a bad counter shard
> What does it mean?
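
The condition named in the log line can be illustrated with a minimal sketch. It is not Cassandra's actual code: the `Shard` type and `reconcile` function are hypothetical, the tuple is read as (counter id, clock, count), and the rule that a higher clock normally wins is an assumption. Only the "same id, same clock, different count, pick highest" case comes from the message itself.

```python
from typing import NamedTuple, Tuple


class Shard(NamedTuple):
    """Hypothetical model of the tuple printed in the log line."""
    counter_id: str
    clock: int
    count: int


def reconcile(a: Shard, b: Shard) -> Tuple[Shard, bool]:
    """Return (winner, invalid) for two shards of the same counter id."""
    if a.counter_id != b.counter_id:
        raise ValueError("shards belong to different counter ids")
    if a.clock != b.clock:
        # Assumption: the shard with the higher clock is the newer one.
        return (a if a.clock > b.clock else b), False
    if a.count != b.count:
        # Same id and clock but different counts: the case the log reports.
        # Per the message, the highest count is picked to self-heal.
        return (a if a.count > b.count else b), True
    return a, False


if __name__ == "__main__":
    cid = "17bfd850-ac52-11e1-0000-6ecd0b5b61e7"
    winner, invalid = reconcile(Shard(cid, 1, 13), Shard(cid, 1, 1))
    print(winner.count, invalid)
```

With the two shards from the log, both carry clock 1, so neither can be treated as newer; the sketch flags the pair as invalid and keeps the shard with count 13.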

