Date: Thu, 7 Aug 2014 14:46:51 -0700
Subject: Is nodetool cleanup necessary on nodes of different data center when new node is added
From: Viswanathan Ramachandran
To: user@cassandra.apache.org

I plan to have a multi-data-center Cassandra 2 setup with 2-4 nodes per data center and several tens of data centers. We have keyspaces replicated on a certain number of nodes in *each* data center. Essentially, each data center has a logical ring that covers all token ranges. We have a vnode-based deployment, so tokens should get assigned to the nodes automatically.

Documentation at http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html suggests that adding a new node requires cleanup to be run on all other nodes of the cluster. However, it does not clarify the procedure in a multi-data-center setup.

My understanding is that nodetool cleanup removes data which no longer belongs to that node. When a new data center is being set up, we are creating completely new replicas and, AFAICT, this does not result in data movement or rebalancing outside of the new data center. Hence there should be no cleanup requirement on nodes of other data centers. Is someone able to confirm whether my understanding is right, and that cleanup is not required on nodes of other data centers?

Thanks
Vish
