From: Brandon Williams
Date: Mon, 23 Jul 2012 20:48:47 -0500
Subject: Re: Bringing a dead node back up after fixing hardware issues
To: user@cassandra.apache.org

On Mon, Jul 23, 2012 at 6:26 PM, Eran Chinthaka Withana wrote:
> Method 1: I copied the data from all the nodes in that data center, into the
> repaired node, and brought it back up. But because of the rate of updates
> happening, the read misses started going up.

That's not really a good method when you scale up and the amount of
data in the cluster won't fit on a single machine.

> Method 2: I issued a removetoken command for that node's token and let the
> cluster stream the data into relevant nodes. At the end of this process, the
> dead node was not showing up in the ring output. Then I brought the node
> back up. I was expecting, Cassandra to first stream data into the new node
> (which happens to be the dead node which was in the cluster earlier) and
> once its done then make it serve reads. But, in the server log, I can see as
> soon the node comes up, it started serving reads, creating a large number of
> read misses.

Removetoken is for dead nodes, so the node has no way of locally
knowing it shouldn't be a cluster member any longer when it starts up.
Instead if you had decommissioned, it would have saved a flag to
indicate it should bootstrap at the next startup.

> So the question is, what is the best way to bring back a dead node (once its
> hardware issues are fixed) without impacting read misses?

Increase your consistency level.
Run a repair on the node once it's back up, unless the hardware repair
kept the node down for longer than gc_grace. In that case you need to
removetoken it, delete all of its data, and bootstrap it back in,
otherwise anything deleted in the meantime can resurrect.

-Brandon
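
The advice above can be sketched as two small checks. This is only an illustration, not anything from Cassandra itself: the function names are hypothetical, the steps are returned as plain-text descriptions rather than executed, and the 10-day value is Cassandra's default gc_grace_seconds, which may differ from what your column families use. The thread also doesn't state a replication factor or the consistency levels in use, so those are parameters here.

```python
from datetime import timedelta

# Assumption: Cassandra's default gc_grace_seconds (864000 s = 10 days).
# Check the value actually configured on your column families.
DEFAULT_GC_GRACE = timedelta(seconds=864000)


def quorum(replication_factor):
    """Number of replicas that form a quorum for the given replication factor."""
    return replication_factor // 2 + 1


def reads_overlap_writes(replication_factor, write_replicas, read_replicas):
    """True when every read must touch at least one replica that acknowledged the write."""
    return read_replicas + write_replicas > replication_factor


def recovery_plan(downtime, gc_grace=DEFAULT_GC_GRACE):
    """Describe the steps from the reply, chosen by how long the node was down."""
    if downtime <= gc_grace:
        # Tombstones have not been collected yet, so a repair is enough.
        return ["start the node", "run repair on the node"]
    # Down longer than gc_grace: a repair could bring deleted data back.
    return [
        "removetoken the node's token",
        "delete all data on the node",
        "bootstrap the node back in",
    ]
```

For example, with a replication factor of 3, reading and writing at QUORUM (2 replicas each) satisfies the overlap check, so a stale returning replica alone cannot answer a read, while reading and writing at ONE does not.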