incubator-cassandra-user mailing list archives

From Nils Pommerien <>
Subject Cassandra and massive TTL expirations cause HEAP issue
Date Tue, 26 Jun 2012 14:38:12 GMT
I am evaluating Cassandra in a log retrieval application.  My ring consists of 3 m2.xlarge instances
(17.1 GB memory, 6.5 ECU (2 virtual cores with 3.25 EC2 Compute Units each), 420 GB of local
instance storage, 64-bit platform) and I am writing at roughly 220 writes/sec.  Per day I
am adding roughly 60 GB of data.  All of this sounds simple and easy, and all three nodes are
humming along with basically no load.
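As a sanity check on those numbers, here is the rough arithmetic for the implied per-row size (assuming a uniform write size, which is of course an approximation):

```python
# Back-of-the-envelope check of the ingest numbers quoted above.
writes_per_sec = 220
seconds_per_day = 86_400
bytes_per_day = 60 * 1024**3          # ~60 GB of new data per day

rows_per_day = writes_per_sec * seconds_per_day   # ~19 million rows/day
avg_row_size = bytes_per_day / rows_per_day       # implied average row size in bytes

print(rows_per_day, round(avg_row_size))
```

So each write carries on the order of 3 KB, which matches log-line-sized payloads.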

The issue is that I am writing all my data with a TTL of 10 days.  After 10 days my cluster
crashes with a java.lang.OutOfMemoryError during compaction of the big column family that
contains roughly 95% of the data.  So after 10 days my data set is 600 GB, and from then on
Cassandra has to tombstone and purge 60 GB of data per day at the same rate of roughly
220 deletes/second.  I am not sure whether Cassandra should be able to handle this, whether
I should take a partitioning approach (one CF per day), or whether there are simply some tweaks
I need to make in the yaml file.  I have tried:

 1.  Decreasing flush_largest_memtables_at to 0.4
 2.  Setting reduce_cache_sizes_at and reduce_cache_capacity_to to 1
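For reference, here is the relevant fragment of my cassandra.yaml after those changes (just the options I touched, with the option names as they appear in the yaml):

```yaml
# Emergency heap-pressure valves, lowered/raised from the shipped defaults:
flush_largest_memtables_at: 0.4    # flush big memtables earlier
reduce_cache_sizes_at: 1           # effectively disable automatic cache shrinking
reduce_cache_capacity_to: 1
```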

Now, the issue remains the same:

WARN [ScheduledTasks:1] 2012-06-11 19:39:42,017 (line 145) Heap is 0.9920103380107628
full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to
the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold
in cassandra.yaml if you don't want Cassandra to do this automatically.

Eventually it will just die with this message.  This affects all nodes in the cluster, not
just one.

Dump file is incomplete: file size limit
ERROR 19:39:39,695 Exception in thread Thread[ReadStage:134,5,main]
java.lang.OutOfMemoryError: Java heap space
ERROR 19:39:39,724 Exception in thread Thread[MutationStage:57,5,main]
java.lang.OutOfMemoryError: Java heap space
      at org.apache.cassandra.utils.FBUtilities.hashToBigInteger(
      at org.apache.cassandra.dht.RandomPartitioner.getToken(
      at org.apache.cassandra.dht.RandomPartitioner.decorateKey(
      at org.apache.cassandra.db.RowPosition.forKey(

Any help is highly appreciated.  It would be great to tune things so that I can keep a moving
window of 10 days in Cassandra while dropping the old data… Or, if there is another recommended
way to deal with such sliding time windows, I am open to ideas.
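For the one-CF-per-day partitioning approach I mentioned, the rotation logic would look roughly like this (a sketch only — the `logs_YYYYMMDD` naming scheme and the `window_cf_names` helper are hypothetical, not tied to any client library):

```python
from datetime import date, timedelta

def window_cf_names(today, window_days=10, prefix="logs_"):
    """Return (active, expired) column family names for a sliding window
    of `window_days` daily column families, e.g. logs_20120626.

    Dropping the expired CF wholesale removes a whole day of data at once,
    avoiding per-row tombstones and the compaction pressure they create.
    """
    active = [
        prefix + (today - timedelta(days=i)).strftime("%Y%m%d")
        for i in range(window_days)
    ]
    expired = prefix + (today - timedelta(days=window_days)).strftime("%Y%m%d")
    return active, expired

active, expired = window_cf_names(date(2012, 6, 26))
print(active[0], expired)   # newest CF to write into, oldest CF to drop
```

Writes always go to the newest CF, reads fan out over the active list, and a daily job drops the expired CF.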

Thank you for your help!
