Date: Thu, 10 Jan 2013 07:50:13 +0000 (UTC)
From: "Janne Jalkanen (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Comment Edited] (CASSANDRA-4417) invalid counter shard detected
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/CASSANDRA-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549412#comment-13549412 ]

Janne Jalkanen edited comment on CASSANDRA-4417 at 1/10/13 7:48 AM:
--------------------------------------------------------------------

I'm seeing this while running repair -pr. Three-node cluster, RF 3. Straight upgrade from 1.0.12 to 1.1.8; no topology changes.

I see two invalid shard IDs; the counts differ by more than one, sometimes even by 3000 or more. Seems random to my eyes.

Our counters are in a composite column family, no TTLs in use. We *mostly* increment by one, but sometimes more.

I did disablegossip, disablethrift, drain, shutdown, upgrade, restart on every node in a rolling fashion. Then I did upgradesstables and repair -pr on every node once the entire cluster had been upgraded.

      was (Author: jalkanen):
    I'm seeing this while running repair -pr. Three-node cluster, RF 3. Straight upgrade from 1.0.12 to 1.1.8; no topology changes.

I see two invalid shard IDs; the counts differ by more than one, sometimes even by 3000 or more. Seems random to my eyes.

Our counters are in a composite column family, no TTLs in use. We *mostly* increment by one, but sometimes more.

I did disablegossip, disablethrift, drain, upgrade, restart on every node in a rolling fashion. Then I did upgradesstables and repair -pr on every node once the entire cluster had been upgraded.

> invalid counter shard detected
> -------------------------------
>
>                 Key: CASSANDRA-4417
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4417
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1.1
>         Environment: Amazon Linux
>            Reporter: Senthilvel Rangaswamy
>         Attachments: cassandra-mck.log.bz2, err.txt
>
>
> Seeing errors like these:
> 2012-07-06_07:00:27.22662 ERROR 07:00:27,226 invalid counter shard detected; (17bfd850-ac52-11e1-0000-6ecd0b5b61e7, 1, 13) and (17bfd850-ac52-11e1-0000-6ecd0b5b61e7, 1, 1) differ only in count; will pick highest to self-heal; this indicates a bug or corruption generated a bad counter shard
> What does it mean?
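
For anyone trying to read the quoted error line, here is a minimal sketch of what it appears to describe. It assumes each logged triple is (counter id, clock, count), which is an interpretation of the message wording and not something stated in this thread. The names Shard, parse_invalid_shard_line and pick_highest are hypothetical and are not Cassandra APIs; the sketch only mirrors the "will pick highest to self-heal" rule from the log text, not Cassandra's actual reconciliation code.

```python
import re
from typing import NamedTuple, Tuple


class Shard(NamedTuple):
    """One logged counter shard; field meaning is an assumption (id, clock, count)."""
    counter_id: str
    clock: int
    count: int


# Matches the two triples in an "invalid counter shard detected" log line.
_LOG_RE = re.compile(
    r"invalid counter shard detected; "
    r"\((?P<id1>[0-9a-f-]+), (?P<clock1>-?\d+), (?P<count1>-?\d+)\) and "
    r"\((?P<id2>[0-9a-f-]+), (?P<clock2>-?\d+), (?P<count2>-?\d+)\) "
    r"differ only in count"
)


def parse_invalid_shard_line(line: str) -> Tuple[Shard, Shard]:
    """Extract both shards from a log line, or raise ValueError if it does not match."""
    m = _LOG_RE.search(line)
    if m is None:
        raise ValueError("not an 'invalid counter shard detected' log line")
    first = Shard(m["id1"], int(m["clock1"]), int(m["count1"]))
    second = Shard(m["id2"], int(m["clock2"]), int(m["count2"]))
    return first, second


def pick_highest(a: Shard, b: Shard) -> Shard:
    """Mirror the logged rule: same id and clock, different count, keep the higher count."""
    if (a.counter_id, a.clock) != (b.counter_id, b.clock):
        raise ValueError("shards differ in more than count")
    return a if a.count >= b.count else b
```

Fed the error line quoted in the issue, the two parsed shards share the same id and clock and carry counts 13 and 1, so pick_highest keeps the shard with count 13. The difference between the two counts is the "counts differ by ..." figure mentioned in the comment.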