Date: Mon, 18 Mar 2013 14:58:11 -0700
Subject: Errors on replica nodes halt repair
From: Dane Miller
To: user@cassandra.apache.org

I'm having trouble completing a repair on several of my nodes due to errors during compaction. This is a 6 node cluster using the simple replication strategy, rf=3, with each node assigned a single token. I'm running "nodetool repair -pr" on node1, which progresses until it reaches a specific keyspace, then appears to hang. On the replicas, nodes 2 and 3, I find errors that seem related to compaction.

I'm considering running a scrub if I can determine the column family where the errors occur. But I'm not confident I understand the problem, and I'm wary of making it worse. What's the best way to recover from these errors?

Note, this cluster was recently upgraded from 1.1.6 to 1.2.1, then to 1.2.2.

node2

ERROR [Thread-97275] 2013-03-13 23:51:30,359 CassandraDaemon.java (line 133) Exception in thread Thread[Thread-97275,5,main]
java.lang.RuntimeException: Last written key DecoratedKey(161894077670705622023702574770140080251, 757365723a3a313a3a373537363636393130) >= current key DecoratedKey(

ERROR [CompactionExecutor:7697] 2013-03-15 21:45:59,584 CassandraDaemon.java (line 133) Exception in thread Thread[CompactionExecutor:7697,1,main]
java.lang.AssertionError: originally calculated column size of 321455446 but now it is 321455483

node3

ERROR [Thread-97525] 2013-03-13 23:51:44,788 CassandraDaemon.java (line 133) Exception in thread Thread[Thread-97525,5,main]
java.lang.RuntimeException: Last written key DecoratedKey(161894077670705622023702574770140080251, 757365723a3a313a3a373537363636393130) >= current key DecoratedKey

ERROR [Thread-97564] 2013-03-13 23:54:03,403 CassandraDaemon.java (line 133) Exception in thread Thread[Thread-97564,5,main]
java.lang.RuntimeException: Last written key DecoratedKey(152706250731373455824787766459206671594, 757365723a3a313a3a333434313038323239) >= current key DecoratedKey(

ERROR [Thread-661] 2013-03-15 21:02:05,981 CassandraDaemon.java (line 133) Exception in thread Thread[Thread-661,5,main]
java.lang.NegativeArraySizeException

Details:
6 node cluster
cassandra 1.2.2 - single token per node
RandomPartitioner, EC2Snitch
Replication: SimpleStrategy, rf=3
Ubuntu 10.10 x86_64
EC2 m1.large

Dane
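
The second field of each DecoratedKey in the errors above is the row key in hex, so decoding it may help narrow down which column family to scrub. A minimal sketch in Python, assuming the keys are UTF-8 text; decode_row_key is a hypothetical helper name for illustration, not part of Cassandra or nodetool:

```python
def decode_row_key(hex_key: str) -> str:
    """Decode the hex-encoded row key printed inside DecoratedKey(token, key).

    Assumption: the key bytes are UTF-8 text. Undecodable bytes are
    replaced rather than raising, since a binary key is possible.
    """
    return bytes.fromhex(hex_key).decode("utf-8", errors="replace")


# Hex keys copied from the "Last written key" errors on node2 and node3.
keys = [
    "757365723a3a313a3a373537363636393130",
    "757365723a3a313a3a333434313038323239",
]

for hex_key in keys:
    print(hex_key, "->", decode_row_key(hex_key))
```

Both keys decode to strings of the form user::1::<number> (user::1::757666910 and user::1::344108229), which suggests looking at column families whose row keys follow that pattern.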