Subject: Re: Cassandra 1.1.6 - New node bootstrap not completing
From: Narendra Sharma
To: user@cassandra.apache.org
Date: Fri, 1 Nov 2013 22:06:30 +0530

I was successfully able to bootstrap the node. The issue was RF > 2. Thanks again Robert.

On Wed, Oct 30, 2013 at 10:29 AM, Narendra Sharma wrote:

> Thanks Robert.
>
> I didn't realize that some of the keyspaces (not all and esp. the biggest
> one I was focusing on) had RF > 2. I wasted 3 days on it. Thanks again for
> the pointers. I will try again and share the results.
>
> On Wed, Oct 30, 2013 at 12:28 AM, Robert Coli wrote:
>
>> On Tue, Oct 29, 2013 at 11:45 AM, Narendra Sharma
>> <narendra.sharma@gmail.com> wrote:
>>
>>> We had a cluster of 4 nodes in AWS. The average load on each node was
>>> approx 750GB. We added 4 new nodes. It is now more than 30 hours and the
>>> node is still in JOINING mode.
>>> Specifically I am analyzing the one with IP 10.3.1.29. There is no
>>> compaction or streaming or index building happening.
>>
>> If your cluster has RF>2, you are bootstrapping two nodes into the same
>> range simultaneously. That is not supported. [1,2] The node you are having
>> the problem with is in the range that is probably overlapping.
>>
>> If I were you I would:
>>
>> 1) stop all "Joining" nodes and wipe their state including system keyspace
>> 2) optionally "removetoken" any nodes which remain in cluster gossip
>>    state after stopping
>> 3) re-start/bootstrap them one at a time, waiting for each to complete
>>    bootstrapping before starting the next one
>> 4) (unrelated) Upgrade from 1.1.6 to the head of 1.1.x ASAP.
>>
>> =Rob
>> [1] https://issues.apache.org/jira/browse/CASSANDRA-2434
>> [2] https://issues.apache.org/jira/browse/CASSANDRA-2434?focusedCommentId=13091851&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13091851
>
> --
> Narendra Sharma
> Software Engineer
> http://www.aeris.com
> http://narendrasharma.blogspot.com/

--
Narendra Sharma
Software Engineer
http://www.aeris.com
http://narendrasharma.blogspot.com/
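For anyone scripting step 3 of Robert's advice (bootstrap the new nodes one at a time, waiting for each to finish), here is a minimal Python sketch of the control flow only. Everything in it is an assumption for illustration: `bootstrap_sequentially`, `start_node` and `get_mode` are hypothetical names, and the callables stand in for whatever you actually use to start Cassandra on a host and to read its mode (for example, a wrapper that parses the mode reported by nodetool). The "NORMAL" mode string is likewise assumed as the "finished joining" value.

```python
import time
from typing import Callable, Iterable, List


def bootstrap_sequentially(
    nodes: Iterable[str],
    start_node: Callable[[str], None],   # hypothetical: starts Cassandra on a host
    get_mode: Callable[[str], str],      # hypothetical: returns e.g. "JOINING" or "NORMAL"
    poll_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Start each node only after the previous one has finished joining."""
    joined: List[str] = []
    for node in nodes:
        start_node(node)
        # Wait here until this node reports NORMAL, so that no two nodes
        # are ever bootstrapping at the same time.
        while get_mode(node) != "NORMAL":
            sleep(poll_seconds)
        joined.append(node)
    return joined
```

The loop has no timeout, so in practice it would probably be worth adding one and alerting if a node stays in JOINING far longer than expected, as happened in this thread.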