Return-Path: X-Original-To: apmail-cassandra-user-archive@www.apache.org Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id EF4199FE9 for ; Fri, 30 Mar 2012 23:48:38 +0000 (UTC) Received: (qmail 51797 invoked by uid 500); 30 Mar 2012 23:48:36 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 51774 invoked by uid 500); 30 Mar 2012 23:48:36 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 51766 invoked by uid 99); 30 Mar 2012 23:48:36 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 30 Mar 2012 23:48:36 +0000 X-ASF-Spam-Status: No, hits=3.3 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_PASS,TRACKER_ID X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: local policy) Received: from [208.113.200.5] (HELO homiemail-a52.g.dreamhost.com) (208.113.200.5) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 30 Mar 2012 23:48:29 +0000 Received: from homiemail-a52.g.dreamhost.com (localhost [127.0.0.1]) by homiemail-a52.g.dreamhost.com (Postfix) with ESMTP id A590F6B80F9 for ; Fri, 30 Mar 2012 16:48:06 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=thelastpickle.com; h=from :mime-version:content-type:subject:date:in-reply-to:to :references:message-id; q=dns; s=thelastpickle.com; b=xkPmr8kEct L7G1psXCeaWjpqdKpxIOWtNEn4o3mM7Fs0H/j9HJAODsR0/9WjNIpaGFikeMH5U2 D7MI3HuSBRgg+FB61ixpo+ADMXcdsD1gu0uETZHDd+hHAUtz7X2ppvNvueOLWZVS vrIf8y8reW7nKliUmLFma2nSaerl2ow+g= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=thelastpickle.com; h=from :mime-version:content-type:subject:date:in-reply-to:to :references:message-id; s=thelastpickle.com; bh=1LrgfT6oQw9Rasz5 IFTXAObb20g=; b=pzA01OWF9jONeb6D1YuMERijsKf3Fv0N82xS6/KKvXVQymLk JpIjYLvREzveJ4FcrzMrA0Pr23hEcEWI35238PoKkX7PTffAt0sEt4W8mPOcPOVt RomdPn9yk/uKP6OsXmVSGfHAI1aVqzNcigtP1YrOcnaMgABOiDpj+WkbMuQ= Received: from [172.20.10.3] (41.157.224.49.dyn.cust.vf.net.nz [49.224.157.41]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) (Authenticated sender: aaron@thelastpickle.com) by homiemail-a52.g.dreamhost.com (Postfix) with ESMTPSA id BF41A6B80ED for ; Fri, 30 Mar 2012 16:48:02 -0700 (PDT) From: aaron morton Mime-Version: 1.0 (Apple Message framework v1257) Content-Type: multipart/alternative; boundary="Apple-Mail=_21EE9627-BCBC-4B40-880A-C8A6546DA8A3" Subject: Re: nodetool ring runs very slow Date: Sat, 31 Mar 2012 12:48:01 +1300 In-Reply-To: <1332867826.19668.YahooMailNeo@web31816.mail.mud.yahoo.com> To: user@cassandra.apache.org References: <1329947389.99350.YahooMailNeo@web31804.mail.mud.yahoo.com> <1330102956.62969.YahooMailNeo@web31811.mail.mud.yahoo.com> <1332867826.19668.YahooMailNeo@web31816.mail.mud.yahoo.com> Message-Id: X-Mailer: Apple Mail (2.1257) --Apple-Mail=_21EE9627-BCBC-4B40-880A-C8A6546DA8A3 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=iso-8859-1 You node is under significant memory pressure.=20 Look into: * reducing caches * reducing the JVM heap to 8GB Cheers ----------------- Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 28/03/2012, at 6:03 AM, Feng Qu wrote: > Hi Jonathan, similar problem happens again and there is only one GC = running at that time per system.log.=20 >=20 > This is one node of a 6-node 0.8.6 ring. Heap size on this host is = 16GB. >=20 > [fengqu@slcdbx1035 cassandra]$ date; time cnt ring > Tue Mar 27 09:24:20 GMT+7 2012 > Address DC Rack Status State Load = Owns Token =20 > = 141784319550391026443072753096570088106 =20 > 10.89.74.60 slc rack0 Up Normal 343.97 GB = 16.67% 0 =20 > 10.2.128.55 phx rack0 Up Normal 485.48 GB = 16.67% 28356863910078205288614550619314017621 =20 > 10.89.74.67 slc rack0 Up Normal 252.82 GB = 16.67% 56713727820156410577229101238628035242 =20 > 10.2.128.56 phx rack0 Up Normal 258.49 GB = 16.67% 85070591730234615865843651857942052864 =20 > 10.89.74.62 slc rack0 Up Normal 251.02 GB = 16.67% 113427455640312821154458202477256070485 =20 > 10.2.128.57 phx rack0 Up Normal 451.97 GB = 16.67% 141784319550391026443072753096570088106 =20 > =20 > real 0m25.999s > user 0m0.309s > sys 0m0.048s >=20 > WARN [ScheduledTasks:1] 2012-03-27 09:22:39,499 GCInspector.java = (line 143) Heap is 0.8716646568744459 full. You may need to reduce = memtable and/or cache sizes. Cassandra will now flush up to the two = largest memtables to free up memory. Adjust flush_largest_memtables_at = threshold in cassandra.yaml if you don't want Cassandra to do this = automatically > INFO [ScheduledTasks:1] 2012-03-27 09:23:21,637 GCInspector.java = (line 122) GC for ConcurrentMarkSweep: 1383 ms for 2 collections, = 14756794512 used; max is 16928210944 > WARN [ScheduledTasks:1] 2012-03-27 09:23:21,637 GCInspector.java = (line 143) Heap is 0.8717279434204102 full. You may need to reduce = memtable and/or cache sizes. Cassandra will now flush up to the two = largest memtables to free up memory. Adjust flush_largest_memtables_at = threshold in cassandra.yaml if you don't want Cassandra to do this = automatically > INFO [ScheduledTasks:1] 2012-03-27 09:24:04,844 GCInspector.java = (line 122) GC for ConcurrentMarkSweep: 3090 ms for 2 collections, = 14782314472 used; max is 16928210944 > WARN [ScheduledTasks:1] 2012-03-27 09:24:04,851 GCInspector.java = (line 143) Heap is 0.8732354837083013 full. You may need to reduce = memtable and/or cache sizes. Cassandra will now flush up to the two = largest memtables to free up memory. Adjust flush_largest_memtables_at = threshold in cassandra.yaml if you don't want Cassandra to do this = automatically > INFO [ScheduledTasks:1] 2012-03-27 09:24:46,610 GCInspector.java = (line 122) GC for ConcurrentMarkSweep: 3057 ms for 2 collections, = 14757982328 used; max is 16928210944 > WARN [ScheduledTasks:1] 2012-03-27 09:24:46,616 GCInspector.java = (line 143) Heap is 0.871798111260587 full. You may need to reduce = memtable and/or cache sizes. Cassandra will now flush up to the two = largest memtables to free up memory. Adjust flush_largest_memtables_at = threshold in cassandra.yaml if you don't want Cassandra to do this = automatically > INFO [ScheduledTasks:1] 2012-03-27 09:25:36,371 GCInspector.java = (line 122) GC for ConcurrentMarkSweep: 7430 ms for 2 collections, = 14719911032 used; max is 16928210944 > WARN [ScheduledTasks:1] 2012-03-27 09:25:36,377 GCInspector.java = (line 143) Heap is 0.8695491260532345 full. You may need to reduce = memtable and/or cache sizes. Cassandra will now flush up to the two = largest memtables to free up memory. Adjust flush_largest_memtables_at = threshold in cassandra.yaml if you don't want Cassandra to do this = automatically > =20 > Feng Qu > From: Jonathan Ellis > To: user =20 > Cc: Feng Qu =20 > Sent: Friday, February 24, 2012 2:29 PM > Subject: Re: nodetool ring runs very slow >=20 > Read the server log and look for GCInspector output. >=20 > On Fri, Feb 24, 2012 at 11:02 AM, Feng Qu wrote: > > Hi Jonathan, how to check out whether it's in GC storming? This = server > > crashed few time due to Java heap out of memory. We use 8GB heap on = a server > > with 96GB ram. This is first node in a 6-node ring and it has = opscenter > > community running on it. > > > > Feng Qu > > > > ________________________________ > > From: Jonathan Ellis > > To: user@cassandra.apache.org; Feng Qu > > Sent: Thursday, February 23, 2012 1:19 PM > > Subject: Re: nodetool ring runs very slow > > > > The only time I've seen nodetool be that slow is when it was talking > > to a machine that was either swapping or deep into (JVM) GC = storming. > > > > On Wed, Feb 22, 2012 at 3:49 PM, Feng Qu wrote: > >> We noticed that nodetool ring sometimes returns in 17-20 sec while = it > >> normally runs in less than a sec. There were some compaction = running when > >> it > >> happened. Did compaction cause nodetool slowness? Anything else I = should > >> check? > >>>>> time nodetool -h hostname ring > >> .... > >> real 0m17.595s > >> user 0m0.339s > >> sys 0m0.054s > >> > >> Feng Qu > > > > > > > > -- > > Jonathan Ellis > > Project Chair, Apache Cassandra > > co-founder of DataStax, the source for professional Cassandra = support > > http://www.datastax.com > > > > >=20 >=20 >=20 > --=20 > Jonathan Ellis > Project Chair, Apache Cassandra > co-founder of DataStax, the source for professional Cassandra support > http://www.datastax.com >=20 >=20 --Apple-Mail=_21EE9627-BCBC-4B40-880A-C8A6546DA8A3 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=iso-8859-1 You = node is under significant memory pressure. 

Look = into:
* reducing caches
* reducing the JVM heap to = 8GB

Cheers

http://www.thelastpickle.com

On 28/03/2012, at 6:03 AM, Feng Qu wrote:

Hi Jonathan, similar problem = happens again and there is only one GC running at that time per = system.log. 


[fengqu@slcdbx1035 cassandra]$ date; time cnt = ring
Tue Mar 27 09:24:20 GMT+7 2012
Address         DC          Rack        Status State   Load            Owns    = Token           &nb= sp;            = ;            &= nbsp; 
          = ;            &= nbsp;           &nb= sp;            = ;            &= nbsp;           &nb= sp;       14178431955039102644307275309= 6570088106    
10.89.74.60     slc         rack0       Up     Normal  343.97 GB       16.67%  = 0            &= nbsp;           &nb= sp;            = ;     
10.2.128.55     phx         rack0       Up     Normal  485.48 GB       16.67%  28356863910078205288614550619314017621      =
10.89.74.67     slc         rack0       Up     Normal  252.82 GB       16.67%  56713727820156410577229101238628035242      =
10.2.128.56     phx         rack0       Up     Normal  258.49 GB       16.67%  85070591730234615865843651857942052864      =
10.89.74.62     = slc         rack0       Up     Normal  251.02 GB       16.67%  = 113427455640312821154458202477256070485    
10.2.128.57     = phx         rack0       Up     Normal  451.97 GB       16.67%  141784319550391026443072753096570088106     =
 
real  &nbs= p; 0m25.999s
user    0m0.309s
sys     = 0m0.048s

  WARN = [ScheduledTasks:1] 2012-03-27 09:22:39,499 GCInspector.java (line 143) = Heap is 0.8716646568744459 full.  You may need to reduce memtable = and/or cache sizes.  Cassandra will now flush up to the two largest = memtables to free up memory.  Adjust flush_largest_memtables_at = threshold in cassandra.yaml if you don't want Cassandra to do this = automatically
 INFO [ScheduledTasks:1] 2012-03-27 09:23:21,637 = GCInspector.java (line 122) GC for ConcurrentMarkSweep: 1383 ms for 2 = collections, 14756794512 used; max is 16928210944
 WARN [ScheduledTasks:1] 2012-03-27 = 09:23:21,637 GCInspector.java (line 143) Heap is 0.8717279434204102 = full.  You may need to reduce memtable and/or cache sizes. =  Cassandra will now flush up to the two largest memtables to free = up memory.  Adjust flush_largest_memtables_at threshold in = cassandra.yaml if you don't want Cassandra to do this = automatically
 INFO [ScheduledTasks:1] = 2012-03-27 09:24:04,844 GCInspector.java (line 122) GC for = ConcurrentMarkSweep: 3090 ms for 2 collections, 14782314472 used; max is = 16928210944
 WARN [ScheduledTasks:1] = 2012-03-27 09:24:04,851 GCInspector.java (line 143) Heap is 0.8732354837083013 full.  You = may need to reduce memtable and/or cache sizes.  Cassandra will now = flush up to the two largest memtables to free up memory.  Adjust = flush_largest_memtables_at threshold in cassandra.yaml if you don't want = Cassandra to do this automatically
 INFO [ScheduledTasks:1] 2012-03-27 09:24:46,610 = GCInspector.java (line 122) GC for ConcurrentMarkSweep: 3057 ms for 2 = collections, 14757982328 used; max is 16928210944
 WARN [ScheduledTasks:1] 2012-03-27 09:24:46,616 = GCInspector.java (line 143) Heap is 0.871798111260587 full.  You = may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to the = two largest memtables to free up memory.  Adjust = flush_largest_memtables_at threshold in cassandra.yaml if you don't want = Cassandra to do this automatically
 INFO [ScheduledTasks:1] 2012-03-27 09:25:36,371 = GCInspector.java (line 122) GC for ConcurrentMarkSweep: 7430 ms for 2 = collections, 14719911032 used; max is 16928210944
 WARN [ScheduledTasks:1] = 2012-03-27 09:25:36,377 GCInspector.java (line 143) Heap is = 0.8695491260532345 full.  You may need to reduce memtable and/or = cache sizes.  Cassandra will now flush up to the two largest = memtables to free up memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't = want Cassandra to do this = automatically
 
=
=
From: = Jonathan Ellis <jbellis@gmail.com>
To: user <user@cassandra.apache.org>= ;
Cc: Feng Qu = <mail4qf@gmail.com>
= Sent: Friday, February 24, 2012 2:29 PM
Subject: Re: nodetool ring runs = very slow

Read the server log and look for GCInspector output.

On Fri, Feb = 24, 2012 at 11:02 AM, Feng Qu <mail4qf@gmail.com> = wrote:
> Hi Jonathan, how to check out whether it's in GC = storming? This server
> crashed few time due to Java heap out of = memory. We use 8GB heap on a server
> with 96GB ram. This is first = node in a 6-node ring and it has opscenter
> community running on = it.
>
> Feng Qu
>
> = ________________________________
> From: Jonathan Ellis <jbellis@gmail.com>
> To: = user@cassandra.apache.org; = Feng Qu <mail4qf@gmail.com>
> = Sent: Thursday, February 23, 2012 1:19 PM
> Subject: Re: nodetool ring runs very = slow
>
> The only time I've seen nodetool be that slow is = when it was talking
> to a machine that was either swapping or = deep into (JVM) GC storming.
>
> On Wed, Feb 22, 2012 at = 3:49 PM, Feng Qu <mail4qf@gmail.com> = wrote:
>> We noticed that nodetool ring sometimes returns in = 17-20 sec while it
>> normally runs in less than a sec. There = were some compaction running when
>> it
>> happened. = Did compaction cause nodetool slowness? Anything else I = should
>> check?
>>>>> time nodetool = -h hostname ring
>> ....
>> real = 0m17.595s
>> user 0m0.339s
>> sys = 0m0.054s
>>
>> Feng Qu
>
>
>
> = --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional = Cassandra support
> http://www.datastax.com
>
&g= t;



--
Jonathan Ellis
Project Chair, Apache = Cassandra
co-founder of DataStax, the source for professional = Cassandra support
http://www.datastax.com


=
=
= --Apple-Mail=_21EE9627-BCBC-4B40-880A-C8A6546DA8A3--