incubator-cassandra-user mailing list archives

From Fabian Seifert <fabian.seif...@frischmann.biz>
Subject OOM on replaying CommitLog with Cassandra 2.0.0
Date Tue, 05 Nov 2013 08:06:55 GMT
We are currently evaluating Cassandra 2.0 for use in a project.

The cluster consists of 5 identical nodes; each has 16 GB of RAM, a 6-core Xeon and a 2 TB hard disk.

The maximum heap size is set to 8 GB and row_cache_size_in_mb=0.

The last test was a write test that ran for several days (almost exclusively write requests) and
inserted 850,000 keys and 55,000,000,000 columns into a single column family, resulting in about
170 GB of data stored on each node. One node died with an OOM and I'm not able to bring
it up again. It keeps crashing with an OOM on CommitLog replay:

ERROR [MutationStage:20] 2013-10-30 08:35:23,160 CassandraDaemon.java (line 186) Exception in thread Thread[MutationStage:20,5,main]
java.lang.OutOfMemoryError: Java heap space
 at edu.stanford.ppl.concurrent.SnapTreeMap.comparable(SnapTreeMap.java:534)
 at edu.stanford.ppl.concurrent.SnapTreeMap.update(SnapTreeMap.java:1019)
 at edu.stanford.ppl.concurrent.SnapTreeMap.putIfAbsent(SnapTreeMap.java:985)
 at org.apache.cassandra.db.AtomicSortedColumns$Holder.addColumn(AtomicSortedColumns.java:312)
 at org.apache.cassandra.db.AtomicSortedColumns.addAllWithSizeDelta(AtomicSortedColumns.java:184)
 at org.apache.cassandra.db.Memtable.resolve(Memtable.java:255)
 at org.apache.cassandra.db.Memtable.put(Memtable.java:171)
 at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:842)
 at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:373)
 at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:338)
 at org.apache.cassandra.db.commitlog.CommitLogReplayer$1.runMayThrow(CommitLogReplayer.java:265)
 at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
 at java.util.concurrent.FutureTask.run(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
 at java.lang.Thread.run(Unknown Source)





I also have a heap dump available from the crash during startup.

The CommitLog directory has a total size of nearly 3 GB.
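
For a rough sense of the scale involved, the figures in this message can be put side by side. This is only a back-of-the-envelope sketch: it assumes the key and column counts are 850 thousand and 55 billion, rounds the CommitLog size up to 3 GB, and the function names are purely illustrative.

```python
# Figures quoted in the message (assumption: 850 thousand keys, 55 billion columns).
KEYS = 850_000
COLUMNS = 55_000_000_000
NODES = 5
DATA_PER_NODE_GB = 170
HEAP_GB = 8
COMMITLOG_GB = 3  # "nearly 3" in the message, rounded up here


def columns_per_key(columns: int, keys: int) -> float:
    """Average number of columns stored under one row key."""
    return columns / keys


def cluster_data_gb(nodes: int, per_node_gb: int) -> int:
    """Total on-disk data across all nodes (replicas included)."""
    return nodes * per_node_gb


def commitlog_to_heap_ratio(commitlog_gb: float, heap_gb: float) -> float:
    """Size of the on-disk CommitLog relative to the maximum JVM heap."""
    return commitlog_gb / heap_gb


if __name__ == "__main__":
    print("avg columns per key:", round(columns_per_key(COLUMNS, KEYS)))
    print("data across cluster (GB):", cluster_data_gb(NODES, DATA_PER_NODE_GB))
    print("CommitLog / heap:", commitlog_to_heap_ratio(COMMITLOG_GB, HEAP_GB))
```

So the rows are very wide on average (tens of thousands of columns each), and the CommitLog to be replayed is a sizeable fraction of the configured heap.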




I know that I can clear the CommitLog directory to bring the node up; since it is only test data,
that is no problem for us. The more interesting question is: how can we prevent this?




Regards

Fabian Seifert