If you are starting with Cassandra, I really advise you to start with 1.2.11.

In 1.2+, bloom filters are stored off-heap, and you can use vnodes...
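
For reference, vnodes are enabled in cassandra.yaml via num_tokens; 256 is the commonly recommended value:

    # cassandra.yaml (1.2+)
    num_tokens: 256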

"I summed up the bloom filter usage reported by nodetool cfstats in all the CFs and it was under 50 MB."

This is quite a small value. Are you sure there is no error in your conversion from the byte counts reported by cfstats?
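
One way to double-check the sum (a sketch; it assumes your version of cfstats prints a "Bloom Filter Space Used" line with a byte count as the last field):

    nodetool cfstats | awk '/Bloom Filter Space Used/ {sum += $NF} END {printf "%.1f MB\n", sum / 1048576}'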

If you are trying to understand this, could you tell us:

- How much data do you have per node?
- What is the value of "index_interval" (cassandra.yaml)? (See the sketch below for how to check both.)
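
A sketch for checking both, assuming a stock install where cassandra.yaml sits in the conf/ directory:

    # per-node data size is the "Load" line of nodetool info
    nodetool -h localhost info
    # index sampling interval, default 128
    grep index_interval conf/cassandra.yaml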

If you are trying to fix this, you can try the following (a concrete sketch follows this list):

- changing "memtable_total_space_in_mb" to 1024
- increasing the heap to 10 GB.
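
Something like this (a sketch: on a stock install the heap is set in conf/cassandra-env.sh, and HEAP_NEWSIZE should be sized along with MAX_HEAP_SIZE, roughly 100 MB per CPU core; both changes require a node restart):

    # cassandra.yaml
    memtable_total_space_in_mb: 1024

    # conf/cassandra-env.sh
    MAX_HEAP_SIZE="10G"
    HEAP_NEWSIZE="800M"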

Hope this helps somehow :).

Good luck


2013/10/16 Arindam Barua <abarua@247-inc.com>


During performance testing on our 4-node Cassandra 1.1.5 cluster, we are seeing warning logs about the heap being almost full [1]. I'm trying to figure out why, and how to prevent it.


The tests are being run on a Cassandra ring consisting of 4 dedicated boxes with 32 GB of RAM each.

The heap size is set to 8 GB as recommended.

All the other relevant settings I know of are at their defaults:

- memtable_total_space_in_mb is not set in the yaml, so it should default to 1/3 of the heap size, i.e. roughly 2.7 GB with an 8 GB heap.

- The key cache should be 100 MB at most. I checked the key cache the day after the tests were run via nodetool info, and it reported 4.5 MB in use.

- The row cache is not being used.

- I summed up the bloom filter usage reported by nodetool cfstats across all the CFs, and it was under 50 MB.


The resident size of the cassandra process according to top is 8.4g even now. I took a heap histogram using jmap, but am not sure how to interpret the results usefully [2].
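
For reference, a histogram like [2] can be captured from the running JVM with jmap (substitute the Cassandra process id for <pid>):

    jmap -histo:live <pid> | head -20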


Performance test details:

- The test is write-only, and writes a relatively large amount of data to one CF.

- There is constant background traffic that writes smaller amounts of data to many CFs, and does some reads.


The total number of CFs is 114, but quite a few of them are not used.


Thanks,

Arindam


[1] [14/10/2013:19:15:08 PDT] ScheduledTasks:1:  WARN GCInspector.java (line 145) Heap is 0.8287082580489245 full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
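
As the message says, this emergency-flush threshold is tunable; in 1.1 it defaults to 0.75 of the heap in cassandra.yaml:

    # cassandra.yaml
    flush_largest_memtables_at: 0.75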


[2] Object Histogram:


num      #instances  #bytes    Class description
--------------------------------------------------------------------------
1:       152855      86035312  int[]
2:       13395       45388008  long[]
3:       49517       9712000   java.lang.Object[]
4:       120094      8415560   char[]
5:       145106      6965088   java.nio.HeapByteBuffer
6:       40525       5891040   * ConstMethodKlass
7:       231258      5550192   java.lang.Long
8:       40525       5521592   * MethodKlass
9:       134574      5382960   java.math.BigInteger
10:      36692       4403040   java.net.SocksSocketImpl
11:      3741        4385048   * ConstantPoolKlass
12:      63875       3538128   * SymbolKlass
13:      104048      3329536   java.lang.String
14:      132636      3183264   org.apache.cassandra.db.DecoratedKey
15:      97466       3118912   java.util.concurrent.ConcurrentHashMap$HashEntry
16:      97216       3110912   com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node