cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jeff Jirsa <jeff.ji...@crowdstrike.com>
Subject Re: too many full gc in one node of the cluster
Date Fri, 13 Nov 2015 16:11:39 GMT
What Kai Wang is hinting at: one of the common tuning problems people face is they follow the
advice in cassandra-env.sh, which says 100M of young gen space (Xmn) per core. Many people
find that insufficient – raising that to be 30,40, or 50% of heap size (Xmx) MAY help keep
short-lived objects (especially from the read path) in young gen rather than promoted to old
gen, so they’ll be dropped quickly rather than promoted/cleaned with stop-the-world phase
of CMS.

It’s often the case that increasing Xmn and setting XX:MaxTenuringThreshold=6 or 8 will
reduce the number of long CMS pauses. 

- Jeff

From:  Kai Wang
Reply-To:  "user@cassandra.apache.org"
Date:  Friday, November 13, 2015 at 5:35 AM
To:  "user@cassandra.apache.org"
Subject:  Re: too many full gc in one node of the cluster

What's the size of young generation (-Xmn) ?

On Fri, Nov 13, 2015 at 6:38 AM, Jason Wee <peichieh@gmail.com> wrote:
Used to manage/develop for cassandra 1.0.8 for quite sometime. Although 1.0 was rocking stable
but we encountered various problems as load per node grow beyond 500gb. upgrading is one of
the solution but may not be the solution for you but I strongly recommend you upgrade to 1.1
or 1.2. we upgraded the java on the cassandra node and cassandra to 1.1 and a lot of problems
went away.

As for your use cases, a quick solution would probably to just add nodes, or study client
reading pattern so not on a node hot row (the has on the key), or the client configuration
on your application and/or the keyspace replication.

hth,

jason

On Fri, Nov 13, 2015 at 2:35 PM, Shuo Chen <chenatu2006@gmail.com> wrote:
Hi,

We have a small cassandra cluster with 4 nodes for production. All the nodes have similar
hardware configuration and similar data load. The C* version is 1.0.7 (prretty old)

One of the node has much higher cpu usage than others and high full gc frequency, but the
io of this node is not high and data load of this node is even lower. So I have several questions:

1. Is that normal that one of the node having much higher full gc with same jvm configuration?
2. Does this node need special gc tuning and how?
3. How to find the cause of the full gc?

Thank you guys!


The heap size is 8G and max heap size is 16G. The gc config of cassandra-env.sh is default:

JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"

---------------------
I print instance in the gc log:

num     #instances         #bytes  class name
----------------------------------------------
   1:       2982796      238731200  [B
   2:       3889672      186704256  java.nio.HeapByteBuffer
   3:       1749589       55986848  org.apache.cassandra.db.Column
   4:       1803900       43293600  java.util.concurrent.ConcurrentSkipListMap$Node
   5:        859496       20627904  java.util.concurrent.ConcurrentSkipListMap$Index
   6:          5568       18827912  [J
   7:        162630        6505200  java.math.BigInteger
   8:        167572        5716976  [I
   9:        141698        4534336  java.util.concurrent.ConcurrentHashMap$HashEntry
  10:        141505        4528160  com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node
  11:         31491        4376976  <constMethodKlass>
  12:         31491        4291992  <methodKlass>
  13:        171695        4120680  org.apache.cassandra.db.DecoratedKey
  14:          3157        3436120  <constantPoolKlass>
  15:        141784        3402816  java.lang.Long
  16:        141624        3398976  org.apache.cassandra.utils.Pair
  17:        141505        3396120  com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$WeightedValue
  18:         49604        2675352  <symbolKlass>
  19:        162254        2596064  org.apache.cassandra.dht.BigIntegerToken
......
Total      13337798      641834360


----------------------
The gc part and thread status of system log:

INFO [ScheduledTasks:1] 2015-11-13 14:22:08,681 GCInspector.java (line 123) GC for ParNew:
1015 ms for 2 collections, 3886753520 used; max is 8231321600
 INFO [ScheduledTasks:1] 2015-11-13 14:22:09,683 GCInspector.java (line 123) GC for ParNew:
500 ms for 1 collections, 4956287408 used; max is 8231321600
 INFO [ScheduledTasks:1] 2015-11-13 14:22:10,685 GCInspector.java (line 123) GC for ParNew:
627 ms for 1 collections, 5615882296used; max is 8231321600
 INFO [ScheduledTasks:1] 2015-11-13 14:22:12,015 GCInspector.java (line 123) GC for ParNew:
988 ms for 2 collections, 4943363480 used; max is 8231321600
 INFO [ScheduledTasks:1] 2015-11-13 14:22:13,016 GCInspector.java (line 123) GC for ParNew:
373 ms for 1 collections, 5978572832 used; max is 8231321600
 INFO [ScheduledTasks:1] 2015-11-13 14:22:14,020 GCInspector.java (line 123) GC for ParNew:
486 ms for 1 collections, 6209638280used; max is 8231321600
 INFO [ScheduledTasks:1] 2015-11-13 14:22:15,412 GCInspector.java (line 123) GC for ParNew:
898 ms for 2 collections, 6045603728used; max is 8231321600
 INFO [ScheduledTasks:1] 2015-11-13 14:22:16,413 GCInspector.java (line 123) GC for ParNew:
503 ms for 1 collections, 6991263984 used; max is 8231321600
 INFO [ScheduledTasks:1] 2015-11-13 14:22:17,416 GCInspector.java (line 123) GC for ParNew:
746 ms for 1 collections, 7073467384used; max is 8231321600
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,363 GCInspector.java (line 123) GC for ConcurrentMarkSweep:
843 ms for 2 collections, 1130423160 used; max is 8231321600
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,364 MessagingService.java (line 603) 4198 READ
messages dropped in last 5000ms
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,364 StatusLogger.java (line 50) Pool Name   
                Active   Pending   Blocked
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,368 StatusLogger.java (line 65) ReadStage   
                    32       450         0
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,370 StatusLogger.java (line 65) RequestResponseStage
             0        18         0
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,371 StatusLogger.java (line 65) ReadRepairStage
                  0         3         0
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,372 StatusLogger.java (line 65) MutationStage
                    2       343         0
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,373 StatusLogger.java (line 65) ReplicateOnWriteStage
            0         0         0
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,374 StatusLogger.java (line 65) GossipStage 
                     0         3         0
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,375 StatusLogger.java (line 65) AntiEntropyStage
                 0         0         0
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,396 StatusLogger.java (line 65) MigrationStage
                   0         0         0
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,397 StatusLogger.java (line 65) StreamStage 
                     0         0         0
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,397 StatusLogger.java (line 65) MemtablePostFlusher
              0         0         0
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,397 StatusLogger.java (line 65) FlushWriter 
                     0         0         0
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,398 StatusLogger.java (line 65) MiscStage   
                     0         0         0
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,398 StatusLogger.java (line 65) InternalResponseStage
            0         0         0
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,434 StatusLogger.java (line 65) HintedHandoff
                    0         0         0
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,435 StatusLogger.java (line 69) CompactionManager
              n/a         0
 INFO [ScheduledTasks:1] 2015-11-13 14:22:33,436 StatusLogger.java (line 81) MessagingService
               n/a      0,67

-- 
陈硕 Shuo Chen
chenatu2006@gmail.com
chenshuo@whaty.com




Mime
View raw message