From: "Jonathan Ellis (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Date: Fri, 17 Jun 2011 16:12:47 +0000 (UTC)
Message-ID: <1223775745.15381.1308327167432.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <4618939.15076.1308321647406.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (CASSANDRA-2788) Add startup option renew the NodeId (for counters)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051159#comment-13051159 ]

Jonathan Ellis commented on CASSANDRA-2788:
-------------------------------------------

Pasting Sylvain's explanation from IRC:

{quote}
Let me take a small example. Suppose two nodes, A and B. Initially their node_ids will be A1 and B1 respectively, so each counter will have two components, A1 and B1. Now suppose you renew the node_id of A -> A2 because of a corruption. Soon enough, the counters will have 3 components: A1, A2 and B1. Renew it yet another time and the counter context will be A1, A2, A3 and B1. It grows, which is not cool. But because we know that nobody will ever increment A1 and A2 anymore (A3 is the active node id for A), we can merge them (we have to wait for gc_grace and so on for that to be correct, but we do it). So basically we try to keep the context as small as it can be.

If you nuke NodeIdInfo, right now the code won't be able to do that anymore and you will be left with a bigger-than-necessary context for all the counters. So just renewing is more efficient in that sense. But nuking the system table is still 'correct' as far as returning the correct count is concerned.
{quote}

> Add startup option renew the NodeId (for counters)
> --------------------------------------------------
>
> Key: CASSANDRA-2788
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2788
> Project: Cassandra
> Issue Type: Improvement
> Affects Versions: 0.8.0
> Reporter: Sylvain Lebresne
> Assignee: Sylvain Lebresne
> Priority: Minor
> Labels: counters
> Fix For: 0.8.2
>
> Attachments: 0001-Option-to-renew-the-NodeId-on-startup.patch
>
>
> If an sstable of a counter column family is corrupted, the only safe solution a user has right now is to:
> # Remove the NodeId system table to force the node to generate a new NodeId (and thus stop incrementing its previous, corrupted subcount)
> # Remove all the sstables for that column family on that node (this is important because otherwise the node will never get "repaired" for its previous subcount)
> This is far from ideal, but I think this is the price we pay for avoiding the read-before-write. In any case, the first step (removing the NodeId system table) also removes the list of old NodeIds this node has had, which could prevent us from merging the node's other previous NodeIds. This is OK but sub-optimal. This ticket proposes adding a new startup flag that makes the node renew its NodeId, thus replacing this first step.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
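
The counter-context example from the comment (components A1, A2, A3 and B1) can be illustrated with a small Python sketch. This is only a toy model, not Cassandra's code: the dict layout, the function names total and merge_retired, the sub-count values, and the choice to fold retired ids into the newest retired id are all assumptions made for illustration. The gc_grace wait mentioned in the comment is left out entirely.

```python
from typing import Dict, List

# Toy model (assumption): a counter context maps node id -> sub-count.
Context = Dict[str, int]


def total(context: Context) -> int:
    """The counter value is the sum of all sub-counts."""
    return sum(context.values())


def merge_retired(context: Context, retired_ids: List[str]) -> Context:
    """Fold retired node ids (listed oldest first) into the newest retired one.

    Hypothetical helper: it only shows that the context can shrink without
    changing the total when the list of retired ids is known.
    """
    present = [nid for nid in retired_ids if nid in context]
    if len(present) < 2:
        # Nothing to merge, e.g. because the list of old ids was lost.
        return dict(context)
    keep = present[-1]
    merged = {nid: c for nid, c in context.items() if nid not in present}
    merged[keep] = sum(context[nid] for nid in present)
    return merged


# Node A renewed its id twice (A1 -> A2 -> A3); node B never did.
context = {"A1": 5, "A2": 3, "A3": 2, "B1": 7}

# Renewing keeps the record of retired ids, so they can be merged.
compact = merge_retired(context, ["A1", "A2"])

# Removing the NodeId system table loses that record: no merge is possible.
uncompacted = merge_retired(context, [])

print(len(compact), total(compact))
print(len(uncompacted), total(uncompacted))
```

In both cases the total is unchanged, which matches the point that nuking the system table is still correct; only the size of the context differs.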