cassandra-commits mailing list archives

From "Jacek Furmankiewicz (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7361) Cassandra locks up in full GC when you assign the entire heap to row cache
Date Fri, 06 Jun 2014 22:52:01 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020491#comment-14020491 ]

Jacek Furmankiewicz commented on CASSANDRA-7361:
------------------------------------------------

JNA made no difference. I ensured it was copied directly into the lib folder, and I can see it on the classpath:

{quote}
 java -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar -XX:+CMSClassUnloadingEnabled
-XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms8G -Xmx8G -Xmn1G -XX:+HeapDumpOnOutOfMemoryError
-Xss256k -XX:StringTableSize=1000003 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseTLAB -XX:+UseCondCardMark -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199
-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false
-Dlog4j.configuration=log4j-server.properties -Dlog4j.defaultInitOverride=true -Dcassandra-foreground=yes
-cp /etc/cassandra:/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.3.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/guava-15.0.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jline-1.0.jar:/usr/share/cassandra/lib/jna-4.1.0.jar:/usr/share/cassandra/lib/jna-platform-4.1.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.9.1.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/lz4-1.2.0.jar:/usr/share/cassandra/lib/metrics-core-2.2.0.jar:/usr/share/cassandra/lib/netty-3.6.6.Final.jar:/usr/share/cassandra/lib/reporter-config-2.1.0.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.7.2.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.7.2.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.0.5.jar:/usr/share/cassandra/lib/snaptree-0.1.jar:/usr/share/cassandra/lib/super-csv-2.1.0.jar:/usr/share/cassandra/lib/thrift-server-0.3.3.jar:/usr/share/cassandra/apache-cassandra-2.0.7.jar:/usr/share/cassandra/apache-cassandra-thrift-2.0.7.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/stress.jar:
org.apache.cassandra.service.CassandraDaemon
{quote}
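
For what it's worth, having jna-4.1.0.jar on the classpath does not by itself confirm that the native library actually initialized. A quick sanity check (a sketch only; the log path assumes the default package install location used above):

{quote}
# confirm the JNA jars are where the classpath above expects them
ls -l /usr/share/cassandra/lib/jna*.jar
# check whether Cassandra logged anything about JNA at startup
# (log path assumes the default package install location)
grep -i jna /var/log/cassandra/system.log
{quote}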

Same issue with stuck GC:

{quote}
 S0C      S1C      S0U   S1U      EC       EU       OC        OU        PC      PU      YGC   YGCT    FGC  FGCT     GCT
 104832.0 104832.0 0.0   104832.0 838912.0 838912.0 7340032.0 7340032.0 42768.0 25546.4 3698  321.750 470  4066.134 4387.884
 104832.0 104832.0 0.0   104832.0 838912.0 838912.0 7340032.0 7340032.0 42768.0 25546.4 3698  321.750 470  4066.134 4387.884
 104832.0 104832.0 0.0   104832.0 838912.0 838912.0 7340032.0 7340032.0 42768.0 25546.4 3698  321.750 470  4066.134 4387.884
 104832.0 104832.0 0.0   0.0      838912.0 220787.2 7340032.0 7340032.0 42768.0 25546.7 3698  321.750 470  4108.722 4430.473
 104832.0 104832.0 0.0   104832.0 838912.0 838912.0 7340032.0 7340032.0 42768.0 25546.7 3698  321.750 471  4108.722 4430.473
 104832.0 104832.0 0.0   104832.0 838912.0 838912.0 7340032.0 7340032.0 42768.0 25546.7 3698  321.750 471  4108.722 4430.473
 104832.0 104832.0 0.0   104832.0 838912.0 838912.0 7340032.0 7340032.0 42768.0 25546.7 3698  321.750 471  4108.722 4430.473
 104832.0 104832.0 0.0   104832.0 838912.0 838912.0 7340032.0 7340032.0 42768.0 25546.7 3698  321.750 471  4108.722 4430.473
 104832.0 104832.0 0.0   104832.0 838912.0 838912.0 7340032.0 7340032.0 42768.0 25546.7 3698  321.750 471  4108.722 4430.473
 104832.0 104832.0 0.0   104832.0 838912.0 838912.0 7340032.0 7340032.0 42768.0 25546.7 3698  321.750 471  4108.722 4430.473
 104832.0 104832.0 0.0   91405.7  838912.0 838912.0 7340032.0 7340032.0 42768.0 25546.7 3698  321.750 472  4137.400 4459.150
 104832.0 104832.0 0.0   104832.0 838912.0 838912.0 7340032.0 7340032.0 42768.0 25546.7 3698  321.750 473  4137.400 4459.150
 104832.0 104832.0 0.0   104832.0 838912.0 838912.0 7340032.0 7340032.0 42768.0 25546.7 3698  321.750 473  4137.400 4459.150
 104832.0 104832.0 0.0   104832.0 838912.0 838912.0 7340032.0 7340032.0 42768.0 25546.7 3698  321.750 473  4137.400 4459.150
 104832.0 104832.0 0.0   104832.0 838912.0 838912.0 7340032.0 7340032.0 42768.0 25546.7 3698  321.750 473  4137.400 4459.150
 104832.0 104832.0 0.0   104832.0 838912.0 838912.0 7340032.0 7340032.0 42768.0 25546.7 3698  321.750 473  4137.400 4459.150
{quote}
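
For reference, the samples above are the kind of output produced by jstat -gc against the Cassandra PID. A minimal sketch of how to collect and read them (the pgrep pattern is illustrative):

{quote}
# sample GC counters every 5 seconds against the running daemon
jstat -gc $(pgrep -f CassandraDaemon) 5000
# In the output above, EU == EC and OU == OC on nearly every sample while FGC
# climbs (470 -> 473) and OU never drops: full collections are running back to
# back and reclaiming essentially nothing, i.e. the live set fills the old gen.
{quote}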

> Cassandra locks up in full GC when you assign the entire heap to row cache
> --------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7361
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7361
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Ubuntu, RedHat, JDK 1.7
>            Reporter: Jacek Furmankiewicz
>            Priority: Minor
>
> We have a long-running batch load process, which runs for many hours.
> A massive amount of writes, in large mutation batches (we increase the thrift frame size to 45 MB).
> Everything goes well, but after about 3 hrs of processing everything locks up. We start getting NoHostsAvailable exceptions on the Java application side (with Astyanax as our driver), eventually socket timeouts.
> Looking at Cassandra, we can see that it is using nearly the full 8 GB of heap and is unable to free it. It spends most of its time in full GC, but the amount of memory does not go down.
> Here is a long sample from jstat to show this over an extended time period, e.g.
> http://aep.appspot.com/display/NqqEagzGRLO_pCP2q8hZtitnuVU/
> This continues even after we shut down our app. Nothing is connected to Cassandra any more, yet it is still stuck in full GC and cannot free up memory.
> Running nodetool tpstats shows that nothing is pending, all seems OK:
> {quote}
> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> ReadStage                         0         0       69555935         0                 0
> RequestResponseStage              0         0              0         0                 0
> MutationStage                     0         0       73123690         0                 0
> ReadRepairStage                   0         0              0         0                 0
> ReplicateOnWriteStage             0         0              0         0                 0
> GossipStage                       0         0              0         0                 0
> CacheCleanupExecutor              0         0              0         0                 0
> MigrationStage                    0         0             46         0                 0
> MemoryMeter                       0         0           1125         0                 0
> FlushWriter                       0         0            824         0                30
> ValidationExecutor                0         0              0         0                 0
> InternalResponseStage             0         0             23         0                 0
> AntiEntropyStage                  0         0              0         0                 0
> MemtablePostFlusher               0         0           1783         0                 0
> MiscStage                         0         0              0         0                 0
> PendingRangeCalculator            0         0              1         0                 0
> CompactionExecutor                0         0          74330         0                 0
> commitlog_archiver                0         0              0         0                 0
> HintedHandoff                     0         0              0         0                 0
> Message type           Dropped
> RANGE_SLICE                  0
> READ_REPAIR                  0
> PAGED_RANGE                  0
> BINARY                       0
> READ                       585
> MUTATION                 75775
> _TRACE                       0
> REQUEST_RESPONSE             0
> COUNTER_MUTATION             0
> {quote}
> We had this happen on 2 separate boxes, one with 2.0.6, the other with 2.0.8.
> Right now this is a total blocker for us. We are unable to process the customer data and have to abort in the middle of large processing runs.
> This is a new customer, so we did not have a chance to see if this occurred with 1.1 or 1.2 in the past (we moved to 2.0 recently).
> We still have the Cassandra process running; please let us know if there is anything else we could run to give you more insight.
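
For context on the ticket title: in 2.0 the row cache is sized via row_cache_size_in_mb in cassandra.yaml. A hypothetical configuration matching the reported condition, with the value chosen only to illustrate a row cache sized at the full 8 GB heap (not taken from the reporter's cluster):

{quote}
# illustrative values only -- not the reporter's actual settings
grep -E '^row_cache_size_in_mb' /etc/cassandra/cassandra.yaml
# row_cache_size_in_mb: 8192   <- a row cache as large as the whole 8 GB heap
{quote}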



