cassandra-commits mailing list archives

From "Brent Haines (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-8723) Cassandra 2.1.2 Memory issue - java process memory usage continuously increases until process is killed by OOM killer
Date Mon, 09 Feb 2015 06:56:35 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311864#comment-14311864 ]

Brent Haines edited comment on CASSANDRA-8723 at 2/9/15 6:55 AM:
-----------------------------------------------------------------

[~jeffl] I ran 

{code} watch -n 10 'nodetool compactionstats' {code}

on the affected node and watched it for a while. For us it would always end up on the same
compaction of the same CF, where it would get stuck until the OOM happened. The stats on the
compaction give you a hint -- the total number of bytes is the same each time, and the
compaction gets some portion of the way through before progress freezes and eventually the
system runs out of memory.
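
The "progress freezes" check described above can be automated once the completed and total byte counts have been read off successive compactionstats samples. The following is only a minimal sketch: the Sample type, the is_stalled helper and the window of 3 samples are illustrative assumptions, not part of nodetool or of this report, and parsing the nodetool output is deliberately left out.

```python
from typing import NamedTuple, Sequence


class Sample(NamedTuple):
    """One reading for a single compaction (hypothetical type for this sketch)."""
    completed: int  # bytes compacted so far
    total: int      # total bytes the compaction has to process


def is_stalled(samples: Sequence[Sample], window: int = 3) -> bool:
    """Return True when the last `window` samples are identical and unfinished.

    Identical samples mean the same total (same compaction) and no growth
    in completed bytes, which is the symptom described in the comment.
    """
    if len(samples) < window:
        return False  # not enough history to judge
    recent = samples[-window:]
    first = recent[0]
    unchanged = all(s == first for s in recent)
    return unchanged and first.completed < first.total


# Example: progress advances, then freezes at 200 of 500 bytes.
history = [Sample(100, 500), Sample(200, 500), Sample(200, 500), Sample(200, 500)]
print(is_stalled(history))
```

With samples taken every 10 seconds, a window of 3 flags a compaction that has not moved for about 20 seconds; a real check would likely want a much longer window, since large compactions can legitimately pause.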

We have the standard replication factor of 3, so it was no big deal to stop Cassandra, delete
the node's storage of that CF, and then restart and run repair. Care must be taken, obviously,
but it did restore steady state for us in 3 separate incidents. Once it's fixed on a node,
we haven't had the issue return on that node.
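
The order of the recovery steps above (stop, delete that CF's storage, restart, repair) can be written down as a dry run that only prints commands and executes nothing. This is a sketch under several assumptions that are not stated in the comment: the recovery_plan helper is hypothetical, the service commands depend on how Cassandra was installed, the data directory is the packaged default, and the table directory is assumed to follow the 2.1 "<table>-<id>" naming. Check each command against the actual node before running anything.

```python
import re
from typing import List

_NAME = re.compile(r"^\w+$")  # keep names identifier-like so the printed commands stay safe


def recovery_plan(keyspace: str, table: str,
                  data_dir: str = "/var/lib/cassandra/data") -> List[str]:
    """Return the commands for the stop/delete/restart/repair procedure, in order.

    Nothing is executed. data_dir and the service commands are assumptions.
    """
    for name in (keyspace, table):
        if not _NAME.match(name):
            raise ValueError(f"unexpected characters in name: {name!r}")
    return [
        "sudo service cassandra stop",             # assumption: init-script install
        f"rm -r {data_dir}/{keyspace}/{table}-*",  # assumption: 2.1 '<table>-<id>' directories
        "sudo service cassandra start",
        f"nodetool repair {keyspace} {table}",     # re-sync the deleted CF from the other replicas
    ]


# Hypothetical keyspace and table names, for illustration only.
for command in recovery_plan("my_keyspace", "my_table"):
    print(command)
```

The procedure relies on the other replicas holding the data, which is why the replication factor of 3 matters here.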




was (Author: thebrenthaines):
[~jeffl] I ran 

{code} watch -n 10 'nodetool compactionstats' {code}

on the effected node and watch it for awhile. For us it would always end up on the same compaction,
of the same CF where it would get stuck until the OOM happened. The stats on the compaction
give you a hint -- the total number of bytes are the same each time, then it will get some
portion of the way through the compaction when progress freezes and eventually the system
runs OOM.

We have the standard replication factor of 3 so it was no big deal to stop cassandra, delete
the node's storage of that CF and then restart and run repair. Care must be taken, obviously,
but it did recover steady state for us on 3 separate incidents. Once it's fixed no a node,
we haven't had issues return for that node.



> Cassandra 2.1.2 Memory issue - java process memory usage continuously increases until process is killed by OOM killer
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8723
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8723
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jeff Liu
>             Fix For: 2.1.3
>
>         Attachments: cassandra.yaml
>
>
> Issue:
> We have an on-going issue with cassandra nodes running with continuously increasing memory until killed by OOM.
> {noformat}
> Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783481] Out of memory: Kill process 13919 (java) score 911 or sacrifice child
> Jan 29 10:15:41 cass-chisel19 kernel: [24533109.783557] Killed process 13919 (java) total-vm:18366340kB, anon-rss:6461472kB, file-rss:6684kB
> {noformat}
> System Profile:
> cassandra version 2.1.2
> system: aws c1.xlarge instance with 8 cores, 7.1G memory.
> cassandra jvm:
> -Xms1792M -Xmx1792M -Xmn400M -Xss256k
> {noformat}
> java -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.8.jar -XX:+UseThreadPriorities
-XX:ThreadPriorityPolicy=42 -Xms1792M -Xmx1792M -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss256k
-XX:StringTableSize=1000003 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseTLAB -XX:+CMSClassUnloadingEnabled -XX:+UseCondCardMark -XX:+PrintGCDetails -XX:+PrintGCDateStamps
-XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure
-Xloggc:/var/log/cassandra/gc-1421511249.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5
-XX:GCLogFileSize=48M -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199
-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false
-javaagent:/usr/share/java/graphite-reporter-agent-1.0-SNAPSHOT.jar=graphiteServer=metrics-a.hq.nest.com;graphitePort=2003;graphitePollInt=60
-Dlogback.configurationFile=logback.xml -Dcassandra.logdir=/var/log/cassandra -Dcassandra.storagedir=
-Dcassandra-pidfile=/var/run/cassandra/cassandra.pid -cp /etc/cassandra:/usr/share/cassandra/lib/airline-0.6.jar:/usr/share/cassandra/lib/antlr-runtime-3.5.2.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/commons-math3-3.2.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/guava-16.0.jar:/usr/share/cassandra/lib/high-scale-lib-1.0.6.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.8.jar:/usr/share/cassandra/lib/javax.inject.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jline-1.0.jar:/usr/share/cassandra/lib/jna-4.0.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.9.1.jar:/usr/share/cassandra/lib/logback-classic-1.1.2.jar:/usr/share/cassandra/lib/logback-core-1.1.2.jar:/usr/share/cassandra/lib/lz4-1.2.0.jar:/usr/share/cassandra/lib/metrics-core-2.2.0.jar:/usr/share/cassandra/lib/metrics-graphite-2.2.0.jar:/usr/share/cassandra/lib/mx4j-tools.jar:/usr/share/cassandra/lib/netty-all-4.0.23.Final.jar:/usr/share/cassandra/lib/reporter-config-2.1.0.jar:/usr/share/cassandra/lib/slf4j-api-1.7.2.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.0.5.2.jar:/usr/share/cassandra/lib/stream-2.5.2.jar:/usr/share/cassandra/lib/stringtemplate-4.0.2.jar:/usr/share/cassandra/lib/super-csv-2.1.0.jar:/usr/share/cassandra/lib/thrift-server-0.3.7.jar:/usr/share/cassandra/apache-cassandra-2.1.2.jar:/usr/share/cassandra/apache-cassandra-thrift-2.1.2.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/cassandra-driver-core-2.0.5.jar:/usr/share/cassandra/netty-3.9.0.Final.jar:/usr/share/cassandra/stress.jar:
-XX:HeapDumpPath=/var/lib/cassandra/java_1421511248.hprof -XX:ErrorFile=/var/lib/cassandra/hs_err_1421511248.log
org.apache.cassandra.service.CassandraDaemon
> {noformat}
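
The numbers quoted above already show that the growth is outside the Java heap: the heap is capped at -Xmx1792M, while the kernel reports an anon-rss of 6461472kB when it kills the process. A small worked calculation, assuming the kernel's kB and the JVM's M are both 1024-based units, and using a hypothetical helper name:

```python
ANON_RSS_KB = 6_461_472   # anon-rss from the OOM killer log line
MAX_HEAP_MIB = 1792       # -Xmx1792M from the JVM options


def rss_outside_heap_mib(anon_rss_kb: int, max_heap_mib: int) -> float:
    """Lower bound on resident memory that cannot belong to the Java heap."""
    return anon_rss_kb / 1024 - max_heap_mib


outside = rss_outside_heap_mib(ANON_RSS_KB, MAX_HEAP_MIB)
print(f"{outside:.0f} MiB")  # prints "4518 MiB"
```

This is a lower bound, since the heap may not be fully resident. At least about 4.4 GiB of the process's memory was therefore allocated outside the heap, which heap tuning alone would not limit.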



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
