cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Lohfink <clohf...@blackbirdit.com>
Subject Re: Trying to understand cassandra gc logs
Date Tue, 16 Sep 2014 14:58:39 GMT
Check out:

https://blogs.oracle.com/poonam/entry/understanding_cms_gc_logs

The young gen collection is stop the world that pauses application threads, and a couple parts
of CMS can as well.  I would recommend disabling the 

#JVM_OPTS="$JVM_OPTS -XX:PrintFLSStatistics=1"

line in your cassandra-env.sh as well to simplify things a little and make it parsable by
gc log visualization tools

---
Chris Lohfink

On Sep 15, 2014, at 9:40 PM, Donald Smith <Donald.Smith@audiencescience.com> wrote:

> I understand that cassandra uses ParNew GC for New Gen and CMS for Old Gen (tenured).
  I’m trying to interpret in the logs when a Full GC happens and what kind of Full GC is
used.  It never says “Full GC” or anything like that. But I see that whenever there’s
a line like
>  
>     2014-09-15T18:04:17.197-0700: 117485.192: [CMS-concurrent-mark-start]
>  
> the count of full GCs increases from
>  
> {Heap after GC invocations=158459 (full 931):
>  
> to a line like:
>  
> {Heap before GC invocations=158459 (full 932):
>  
> See the highlighted lines in the gclog output below.  So, apparently there was a full
GC between those two lines. Between those lines it also has two lines, such as:
>  
>    2014-09-15T18:04:17.197-0700: 117485.192: Total time for which application threads
were stopped: 0.0362080 seconds
>    2014-09-15T18:04:17.882-0700: 117485.877: Total time for which application threads
were stopped: 0.0129660 seconds
>  
> Also, the full count (932 above) is always exactly half the number (1864) FGC returned
by jstat, as in
>  
> dc1-cassandra01.dc01 /var/log/cassandra> sudo jstat -gcutil 28511
>   S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
> 55.82   0.00  82.45  45.02  59.76 165772 5129.728  1864  320.247 5449.975
>  
> So, I am apparently correct that “(full 932)” is the count of Full GCs. I’m perplexed
by the log output, though.
>  
> I also see lines mentioning “concurrent mark-sweep” that do not appear to correspond
to full GCs. So, my questions are:  Is CMS used also for full GCs? If not, what kind of gc
is done? The logs don’t say.    Lines saying “Total time for which application threads
were stopped” appear twice per full gc; why?  Apparently, even our Full GCs are fast. 99%
of them finish within 0.18  seconds; 99.9% finish within 0.5 seconds (which may be too slow
for some of our clients).
>  
> Here below is some log output, with interesting parts highlighted in grey or yellow.
 Thanks, Don
>  
> {Heap before GC invocations=158458 (full 931):
> par new generation   total 1290240K, used 1213281K [0x00000005bae00000, 0x0000000612600000,
0x0000000612600000)
>   eden space 1146880K, 100% used [0x00000005bae00000, 0x0000000600e00000, 0x0000000600e00000)
>   from space 143360K,  46% used [0x0000000600e00000, 0x0000000604ed87c0, 0x0000000609a00000)
>   to   space 143360K,   0% used [0x0000000609a00000, 0x0000000609a00000, 0x0000000612600000)
> concurrent mark-sweep generation total 8003584K, used 5983572K [0x0000000612600000, 0x00000007fae00000,
0x00000007fae00000)
> concurrent-mark-sweep perm gen total 44820K, used 26890K [0x00000007fae00000, 0x00000007fd9c5000,
0x0000000800000000)
> 2014-09-15T18:04:17.131-0700: 117485.127: [GCBefore GC:
> Statistics for BinaryTreeDictionary:
> ------------------------------------
> Total Free Space: 197474318
> Max   Chunk Size: 160662270
> Number of Blocks: 3095
> Av.  Block  Size: 63804
> Tree      Height: 32
> Before GC:
> Statistics for BinaryTreeDictionary:
> ------------------------------------
> Total Free Space: 2285026
> Max   Chunk Size: 2279936
> Number of Blocks: 8
> Av.  Block  Size: 285628
> Tree      Height: 5
> 2014-09-15T18:04:17.133-0700: 117485.128: [ParNew
> Desired survivor size 73400320 bytes, new threshold 1 (max 1)
> - age   1:   44548776 bytes,   44548776 total
> : 1213281K->49867K(1290240K), 0.0264540 secs] 7196854K->6059170K(9293824K)After
GC:
> Statistics for BinaryTreeDictionary:
> ------------------------------------
> Total Free Space: 195160244
> Max   Chunk Size: 160662270
> Number of Blocks: 3093
> Av.  Block  Size: 63097
> Tree      Height: 32
> After GC:
> Statistics for BinaryTreeDictionary:
> ------------------------------------
> Total Free Space: 2285026
> Max   Chunk Size: 2279936
> Number of Blocks: 8
> Av.  Block  Size: 285628
> Tree      Height: 5
> , 0.0286700 secs] [Times: user=0.37 sys=0.01, real=0.03 secs]
> Heap after GC invocations=158459 (full 931):
> par new generation   total 1290240K, used 49867K [0x00000005bae00000, 0x0000000612600000,
0x0000000612600000)
>   eden space 1146880K,   0% used [0x00000005bae00000, 0x00000005bae00000, 0x0000000600e00000)
>   from space 143360K,  34% used [0x0000000609a00000, 0x000000060cab2e18, 0x0000000612600000)
>   to   space 143360K,   0% used [0x0000000600e00000, 0x0000000600e00000, 0x0000000609a00000)
> concurrent mark-sweep generation total 8003584K, used 6009302K [0x0000000612600000, 0x00000007fae00000,
0x00000007fae00000)
> concurrent-mark-sweep perm gen total 44820K, used 26890K [0x00000007fae00000, 0x00000007fd9c5000,
0x0000000800000000)
> }
> 2014-09-15T18:04:17.161-0700: 117485.156: Total time for which application threads were
stopped: 0.0421350 seconds
> 2014-09-15T18:04:17.173-0700: 117485.168: [GC [1 CMS-initial-mark: 6009302K(8003584K)]
6059194K(9293824K), 0.0231840 secs] [Times: user=0.03 sys=0.00, real=0.03 secs]
> 2014-09-15T18:04:17.197-0700: 117485.192: Total time for which application threads were
stopped: 0.0362080 seconds
> 2014-09-15T18:04:17.197-0700: 117485.192: [CMS-concurrent-mark-start]
> 2014-09-15T18:04:17.681-0700: 117485.677: [CMS-concurrent-mark: 0.484/0.484 secs] [Times:
user=6.42 sys=1.23, real=0.48 secs]
> 2014-09-15T18:04:17.682-0700: 117485.677: [CMS-concurrent-preclean-start]
> 2014-09-15T18:04:17.739-0700: 117485.734: [CMS-concurrent-preclean: 0.054/0.057 secs]
[Times: user=0.58 sys=0.17, real=0.06 secs]
> 2014-09-15T18:04:17.739-0700: 117485.734: [CMS-concurrent-abortable-preclean-start]
> 2014-09-15T18:04:17.882-0700: 117485.877: Total time for which application threads were
stopped: 0.0129660 seconds
> {Heap before GC invocations=158459 (full 932):  
> par new generation   total 1290240K, used 1196755K [0x00000005bae00000, 0x0000000612600000,
0x0000000612600000)
>   eden space 1146880K, 100% used [0x00000005bae00000, 0x0000000600e00000, 0x0000000600e00000)
>   from space 143360K,  34% used [0x0000000609a00000, 0x000000060cab4c50, 0x0000000612600000)
>   to   space 143360K,   0% used [0x0000000600e00000, 0x0000000600e00000, 0x0000000609a00000)
> concurrent mark-sweep generation total 8003584K, used 6009302K [0x0000000612600000, 0x00000007fae00000,
0x00000007fae00000)
> concurrent-mark-sweep perm gen total 44820K, used 26890K [0x00000007fae00000, 0x00000007fd9c5000,
0x0000000800000000)
> 2014-09-15T18:04:17.894-0700: 117485.889: [GCBefore GC:
> Statistics for BinaryTreeDictionary:
> ------------------------------------
> Total Free Space: 195160244
> Max   Chunk Size: 160662270
> Number of Blocks: 3093
> Av.  Block  Size: 63097
> Tree      Height: 32
> Before GC:
> Statistics for BinaryTreeDictionary:
> ------------------------------------
> Total Free Space: 2285026
> Max   Chunk Size: 2279936
> Number of Blocks: 8
> Av.  Block  Size: 285628
> Tree      Height: 5
> 2014-09-15T18:04:17.895-0700: 117485.890: [ParNew
> Desired survivor size 73400320 bytes, new threshold 1 (max 1)
> - age   1:   60440528 bytes,   60440528 total
> : 1196755K->67320K(1290240K), 0.0273840 secs] 7206057K->6103334K(9293824K)After
GC:
> Statistics for BinaryTreeDictionary:
> ------------------------------------
> Total Free Space: 192632750
> Max   Chunk Size: 160662270
> Number of Blocks: 3091
> Av.  Block  Size: 62320
> Tree      Height: 32
> After GC:
> Statistics for BinaryTreeDictionary:
> ------------------------------------
> Total Free Space: 2285026
> Max   Chunk Size: 2279936
> Number of Blocks: 8
> Av.  Block  Size: 285628
> Tree      Height: 5
> , 0.0296010 secs] [Times: user=0.41 sys=0.00, real=0.03 secs]
> Heap after GC invocations=158460 (full 932):
> par new generation   total 1290240K, used 67320K [0x00000005bae00000, 0x0000000612600000,
0x0000000612600000)
>   eden space 1146880K,   0% used [0x00000005bae00000, 0x00000005bae00000, 0x0000000600e00000)
>   from space 143360K,  46% used [0x0000000600e00000, 0x0000000604fbe050, 0x0000000609a00000)
>   to   space 143360K,   0% used [0x0000000609a00000, 0x0000000609a00000, 0x0000000612600000)
> concurrent mark-sweep generation total 8003584K, used 6036014K [0x0000000612600000, 0x00000007fae00000,
0x00000007fae00000)
> concurrent-mark-sweep perm gen total 44820K, used 26890K [0x00000007fae00000, 0x00000007fd9c5000,
0x0000000800000000)
> }
> 2014-09-15T18:04:17.924-0700: 117485.919: Total time for which application threads were
stopped: 0.0425360 seconds
>  
>  
> Here’s our java commandline:
> 
> 499      28511     1 99 Sep14 ?        10-23:19:02 /usr/java/jdk-1.7.0-u60/bin/java -ea
-javaagent:/usr/share/cassandra//lib/jamm-0.2.5.jar -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities
-XX:ThreadPriorityPolicy=42 -Xms9G -Xmx9G -Xmn1400M -XX:+HeapDumpOnOutOfMemoryError -Xss256k
-XX:StringTableSize=1000003 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseTLAB -XX:+UseCondCardMark -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure
-XX:PrintFLSStatistics=1 -Xloggc:/var/log/cassandra/gc-1410711971.log -Xloggc:/var/log/cassandra/gc.log
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M -Djava.net.preferIPv4Stack=true
-Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false
-Dcassandra.metricsReporterConfigFile=metrics-reporter-config.yaml -Dlog4j.configuration=log4j-server.properties
-Dlog4j.defaultInitOverride=true -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid -cp
/etc/cassandra/conf:/usr/share/java/jna.jar:/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/apache-cassandra-2.0.9.jar:/usr/share/cassandra/lib/apache-cassandra-clientutil-2.0.9.jar:/usr/share/cassandra/lib/apache-cassandra-thrift-2.0.9.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.3.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/guava-15.0.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jline-1.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.9.1.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/lz4-1.2.0.jar:/usr/share/cassandra/lib/metrics-core-2.2.0.jar:/usr/share/cassandra/lib/metrics-ganglia-2.2.0.jar:/usr/share/cassandra/lib/netty-3.6.6.Final.jar:/usr/share/cassandra/lib/reporter-config-2.1.0.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.7.2.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.7.2.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.0.5.jar:/usr/share/cassandra/lib/snaptree-0.1.jar:/usr/share/cassandra/lib/stress.jar:/usr/share/cassandra/lib/super-csv-2.1.0.jar:/usr/share/cassandra/lib/thrift-server-internal-only-0.3.3.jar
org.apache.cassandra.service.CassandraDaemon
>  
> Donald A. Smith | Senior Software Engineer 
> P: 425.201.3900 x 3866
> C: (206) 819-5965
> F: (646) 443-2333
> donalds@AudienceScience.com
> 
> <image001.jpg>


Mime
View raw message