Date: Thu, 31 May 2012 08:20:28 -0700
Subject: Invalid Counter Shard errors?
From: Charles Brophy
To: user@cassandra.apache.org

Hi guys,

We're running a three-node cluster of Cassandra 1.1 servers, upgraded from 1.0.7. Immediately after the upgrade, the error logs of all three servers began filling up with the following messages:

ERROR [ReplicateOnWriteStage:177] 2012-05-31 08:17:02,236 CounterContext.java (line 381) invalid counter shard detected; (3438afc0-7e71-11e1-0000-da5a9d01e7f7, 3, 4) and (3438afc0-7e71-11e1-0000-da5a9d01e7f7, 3, 7) differ only in count; will pick highest to self-heal; this indicates a bug or corruption generated a bad counter shard

ERROR [ValidationExecutor:20] 2012-05-31 08:17:01,570 CounterContext.java (line 381) invalid counter shard detected; (343cf580-7e71-11e1-0000-ebc411012bff, 14, 27) and (343cf580-7e71-11e1-0000-ebc411012bff, 14, 21) differ only in count; will pick highest to self-heal; this indicates a bug or corruption generated a bad counter shard

The counts change but the errors are constant. What is the best course of action? Google only turns up the source code for these errors.

Thanks!
Charles
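
In case it helps anyone triaging the same log lines, here is a minimal Python sketch (a standalone script, not part of Cassandra) that pulls the two shard tuples out of each "invalid counter shard detected" message and reports which count the message says will be kept. It assumes the three fields in each tuple are (shard id, clock, count); the third being the count follows from "differ only in count", while "clock" for the second is an interpretation. All function names are made up for illustration.

```python
import re
from collections import Counter

# Matches the two shard tuples in an "invalid counter shard detected" line.
# The field names (shard id, clock, count) are an assumed reading of the
# message; the log itself only says the tuples "differ only in count".
SHARD_PAIR = re.compile(
    r"invalid counter shard detected; "
    r"\(([0-9a-f-]+), (-?\d+), (-?\d+)\) and \(([0-9a-f-]+), (-?\d+), (-?\d+)\)"
)


def parse_invalid_shard(line):
    """Return two (shard_id, clock, count) tuples, or None if no match."""
    m = SHARD_PAIR.search(line)
    if m is None:
        return None
    first = (m.group(1), int(m.group(2)), int(m.group(3)))
    second = (m.group(4), int(m.group(5)), int(m.group(6)))
    return first, second


def healed_count(first, second):
    """The count the message says is kept: the higher of the two."""
    return max(first[2], second[2])


def tally_by_shard_id(lines):
    """Count how many invalid-shard errors mention each shard id."""
    tally = Counter()
    for line in lines:
        pair = parse_invalid_shard(line)
        if pair is not None:
            tally[pair[0][0]] += 1
    return tally


# The two log lines quoted in the message above.
SAMPLE = [
    "ERROR [ReplicateOnWriteStage:177] 2012-05-31 08:17:02,236 "
    "CounterContext.java (line 381) invalid counter shard detected; "
    "(3438afc0-7e71-11e1-0000-da5a9d01e7f7, 3, 4) and "
    "(3438afc0-7e71-11e1-0000-da5a9d01e7f7, 3, 7) differ only in count; "
    "will pick highest to self-heal",
    "ERROR [ValidationExecutor:20] 2012-05-31 08:17:01,570 "
    "CounterContext.java (line 381) invalid counter shard detected; "
    "(343cf580-7e71-11e1-0000-ebc411012bff, 14, 27) and "
    "(343cf580-7e71-11e1-0000-ebc411012bff, 14, 21) differ only in count; "
    "will pick highest to self-heal",
]

if __name__ == "__main__":
    for line in SAMPLE:
        first, second = parse_invalid_shard(line)
        print(first[0], "counts", first[2], second[2],
              "-> keeps", healed_count(first, second))
    print(tally_by_shard_id(SAMPLE))
```

Running the tally over a full log would show whether the errors come from a handful of shard ids or are spread across many, which seems like useful detail to include when reporting the problem.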