From: "Nick Bailey (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Date: Tue, 27 Jul 2010 16:54:18 -0400 (EDT)
Subject: [jira] Updated: (CASSANDRA-1216) removetoken drops node from ring before re-replicating its data is finished
Message-ID: <12856017.30791280264058388.JavaMail.jira@thor>
In-Reply-To: <12967337.13841277216936683.JavaMail.jira@thor>

     [ https://issues.apache.org/jira/browse/CASSANDRA-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Bailey updated CASSANDRA-1216:
-----------------------------------

    Attachment: 0001-Modify-removeToken-to-be-similar-to-decommission.patch
                0002-Fixes-to-old-tests.patch
                0003-Additional-unit-tests-for-removeToken.patch

Some fixes and tests added. One thing still needs to be fixed:

* Currently the call to removeToken blocks until either:
** all nodes confirm that they have replicated the data for the dead node, or
** a timeout is reached.
* I'm not sure what this timeout should be. Additionally, when nodes throughout the ring attempt to replicate data, there should be a similar timeout before they give up on a source and retry.
* Also, clients may time out before the removeToken timeout is reached or before all the data is replicated. I'm not sure how the user will be able to determine whether the remove finished correctly or whether repair should be run.

> removetoken drops node from ring before re-replicating its data is finished
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-1216
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1216
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Nick Bailey
>             Fix For: 0.7.0
>
>         Attachments: 0001-Modify-removeToken-to-be-similar-to-decommission.patch, 0002-Fixes-to-old-tests.patch, 0003-Additional-unit-tests-for-removeToken.patch
>
>
> this means that if something goes wrong during the re-replication (e.g. a source node is restarted) there is (a) no indication that anything has gone wrong and (b) no way to restart the process (other than the Big Hammer of running repair)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
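
The blocking behavior described in the comment (wait until every node confirms re-replication, or give up after a timeout) could be sketched roughly as follows. This is only an illustration and is not taken from the attached patches: the class `RemovalCoordinator` and its methods are hypothetical names, and the real removeToken code may be structured quite differently. Returning a boolean and exposing the unconfirmed nodes is one possible way for a caller to tell whether the remove finished or whether repair should be run.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Cassandra: tracks which nodes still
// have to confirm that they re-replicated the dead node's data.
class RemovalCoordinator {
    private final Set<String> pending = ConcurrentHashMap.newKeySet();
    private final CountDownLatch latch;

    RemovalCoordinator(Set<String> replicatingNodes) {
        pending.addAll(replicatingNodes);
        latch = new CountDownLatch(replicatingNodes.size());
    }

    // Called when a node reports that its replication has finished.
    // Duplicate or unknown confirmations are ignored.
    void confirmReplication(String node) {
        if (pending.remove(node)) {
            latch.countDown();
        }
    }

    // Blocks until all nodes have confirmed or the timeout elapses.
    // Returns true only if every node confirmed in time.
    boolean awaitCompletion(long timeout, TimeUnit unit) throws InterruptedException {
        return latch.await(timeout, unit);
    }

    // Snapshot of nodes that have not confirmed yet; a non-empty result
    // after a timeout suggests that repair may be needed.
    Set<String> unconfirmedNodes() {
        return Collections.unmodifiableSet(new HashSet<>(pending));
    }
}
```

A caller would construct the coordinator with the set of nodes expected to stream data, call `awaitCompletion` with whatever timeout is eventually chosen, and inspect `unconfirmedNodes()` if it returns false. The sketch does not address the open question of what that timeout should be.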