From: Keith Wright <kwright@nanigans.com>
To: user@cassandra.apache.org
CC: Don Jackson, Dave Carroll
Date: Wed, 5 Feb 2014 08:05:00 -0600
Subject: Bootstrap failure
Hi all,

    We have been struggling with the inability to bootstrap nodes into our 1.2.13 environment with vnodes, using CentOS 6.4 with Java 7. We have an 8-node cluster (32 GB RAM, dual hex core, SSDs, 8 GB heap with 1200 MB eden space, RF=3) with around 1 TB per node, using murmur3; the heap settings behind those numbers are sketched just after the list below. When we go to bootstrap a new node, this is what we see:
  • Bootstrapping node assigns tokens and requests data from the cluster
  • 5-6 nodes within the cluster begin to stream data
  • Around 2 minutes after bootstrap start, between 1 and 4 nodes (sometimes the bootstrapping node and sometimes not) become unresponsive in par new GCs
  • IF no nodes go down during the first 5 minutes of bootstrap, then the bootstrap will succeed without issue
  • GC-mired nodes tend to recover after a minute or two, but the receiving node stops attempting to get more data from the nodes
  • Bootstrap eventually fails (after streaming all the data from nodes that did not go down) with Unable to Fetch Ranges
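For reference, the heap sizes quoted above map onto cassandra-env.sh settings along these lines (a rough sketch assuming the stock 1.2.x defaults for the collector flags, not an exact copy of our file):

  # cassandra-env.sh (sketch; sizes from the description above, collector flags are the stock defaults)
  MAX_HEAP_SIZE="8G"      # becomes -Xms8G -Xmx8G
  HEAP_NEWSIZE="1200M"    # becomes -Xmn1200M: ~983040K eden + 2 x 122880K survivors at SurvivorRatio=8
  # JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled"
  # JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1"
  # JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly"

Those sizes line up with the generations visible in the GC log at the bottom of this mail (983040K eden, 2 x 122880K survivors, 7159808K CMS old gen).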
We have tried the following, and it appears that a bootstrap will sometimes succeed (perhaps 1 in 10) but with no discernible pattern:
  • Increase phi_convict to 16
  • Restart all nodes prior to bootstrap (to ensure the heap is as “clean” as possible)
  • Stop production load against the cluster (to reduce par new churn); after 5 minutes we know whether the bootstrap will succeed, so we then re-enable load
  • Distribute soft interrupts across all CPUs
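Concretely, the first item above is just phi_convict_threshold: 16 in cassandra.yaml (up from the default of 8), and the last one was done along these lines (a sketch; "eth0" and the ffff CPU mask are placeholders, not copied from our hosts):

  # Spread NIC receive softirq work across cores instead of leaving it pinned to one CPU
  # (placeholders: adjust the device name and CPU mask for the actual hosts)
  for q in /sys/class/net/eth0/queues/rx-*; do
    echo ffff > "$q/rps_cpus"    # enable RPS on every receive queue
  done
  # and/or let irqbalance distribute the hardware IRQs:
  service irqbalance start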
Below is output from the GC log of the bootstrapping node while it was stuck in GC.

Has anyone seen this before? This is our production cluster and our inability to grow is a BLOCKING issue for us. Any ideas would be VERY helpful.

Thanks
{Heap before GC invocations=109 (full 0):
 par new generation   total 1105920K, used 1021140K [0x00000005fae00000, 0x0000000645e00000, 0x0000000645e00000)
  eden space 983040K, 100% used [0x00000005fae00000, 0x0000000636e00000, 0x0000000636e00000)
  from space 122880K,  31% used [0x000000063e600000, 0x0000000640b350f0, 0x0000000645e00000)
  to   space 122880K,   0% used [0x0000000636e00000, 0x0000000636e00000, 0x000000063e600000)
 concurrent mark-sweep generation total 7159808K, used 3826815K [0x0000000645e00000, 0x00000007fae00000, 0x00000007fae00000)
 concurrent-mark-sweep perm gen total 24512K, used 24368K [0x00000007fae00000, 0x00000007fc5f0000, 0x0000000800000000)
2014-02-05T13:27:49.621+0000: 210.242: [GC 210.242: [ParNew: 1021140K->122880K(1105920K), 0.2963210 secs] 4847955K->4024095K(8265728K), 0.2965270 secs] [Times: user=4.97 sys=0.00, real=0.30 secs]
Heap after GC invocations=110 (full 0):
 par new generation   total 1105920K, used 122880K [0x00000005fae00000, 0x0000000645e00000, 0x0000000645e00000)
  eden space 983040K,   0% used [0x00000005fae00000, 0x00000005fae00000, 0x0000000636e00000)
  from space 122880K, 100% used [0x0000000636e00000, 0x000000063e600000, 0x000000063e600000)
  to   space 122880K,   0% used [0x000000063e600000, 0x000000063e600000, 0x0000000645e00000)
 concurrent mark-sweep generation total 7159808K, used 3901215K [0x0000000645e00000, 0x00000007fae00000, 0x00000007fae00000)
 concurrent-mark-sweep perm gen total 24512K, used 24368K [0x00000007fae00000, 0x00000007fc5f0000, 0x0000000800000000)
}
Total time for which application threads were stopped: 0.2968550 seconds
Application time: 1.5953840 seconds
Total time for which application threads were stopped: 0.0002040 seconds
Application time: 0.0000510 seconds
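If it helps, the stop-the-world pauses at the end of the log can be summarized with something along these lines (the log path is an assumption; it is wherever -Xloggc points on the node):

  # Count and summarize stop-the-world pauses recorded in the GC log (path assumed)
  grep "Total time for which application threads were stopped" /var/log/cassandra/gc.log |
    awk '{ t = $(NF-1); sum += t; if (t > max) max = t; n++ }
         END { printf "pauses=%d  total=%.2fs  max=%.2fs\n", n, sum, max }'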