I know this subject has been discussed on the list before, and I've read through all of those discussions, but I haven't been able to find a solution to the memory problems listed below... so here it is again...
It seems that the cassandra cluster I'm using is either leaking memory or just using more mem than I expected it to use.
Each host in the ring uses about 12G of ram while in some cases its entire dataset is only 1.5G (take for example .252.124 below with 1.54G)
I use extensive row caching, so I expect memory consumption to be >= 1.5G, but I don't understand why it gets up to 12G. Most of the time I don't care much since I have plenty of memory; however, at times this gets me into GC storms and very slow responses. Also, I'd like to be able to load more data into the cluster and I'm hitting a memory wall, which I didn't expect.
In cassandra.in.sh you'll notice that I do set Xmx to 12G, but given that there's so little data I wouldn't expect the process to use all of it. As a matter of fact, I wanted to insert more data into the cluster but stopped since it wasn't handling the load very well.
I suppose that at the end of the day I only need to know which knobs to configure, but after having played with the configuration for a long time I'm a little lost.
I'm running a 0.6.2 cluster consisting of 6 physical hosts (some with 16G and some with 32G of RAM) distributed between two DCs.
RF is 2 (one replica in each DC).
HH is turned off.
File access is standard (no mmapped files; I tried that and the system just kept swapping itself to death, so I switched back to standard).
I've pasted below the output of nodetool ring and cfstats as well as some vmstat and iostat (not that I think it matters...)
I've also included jmap -heap, and the jmap -histo output is attached; I hope this helps shed some light on memory usage.
Currently the logs don't say anything out of the ordinary so I didn't include them.
$ nodetool -h cass99 -p 9004 ring
Address          Status  Load     Range                                    Ring
192.168.252.99   Up      6.16 GB  28356863910078205288614550619314017621   |<--|
192.168.252.124  Up      1.54 GB  56713727820156410577229101238628035242   |   ^
192.168.252.125  Up      1.54 GB  85070591730234615865843651857942052863   v   |
192.168.254.57   Up      6.15 GB  113427455640312821154458202477256070485  |   ^
192.168.254.58   Up      1.54 GB  141784319550391026443072753096570088106  v   |
192.168.254.59   Up      1.54 GB  170141183460469231731687303715884105727  |-->|
<ColumnFamily CompareWith="BytesType" Name="KvImpressions"
<ColumnFamily CompareWith="BytesType" Name="KvAds"
<ColumnFamily CompareWith="BytesType" Name="KvRatings"
# Licensed to the Apache Software Foundation (ASF) under one
# Arguments to pass to the JVM
Read Count: 5608010
Read Latency: 8.52211627029909 ms.
Write Count: 42794
Write Latency: 0.10353956162078796 ms.
Pending Tasks: 0
Column Family: KvAds
SSTable count: 11
Space used (live): 9331647391
Space used (total): 9331647391
Memtable Columns Count: 84928
Memtable Data Size: 21400502
Memtable Switch Count: 1
Read Count: 5602705
Read Latency: 2.023 ms.
Write Count: 42794
Write Latency: 0.060 ms.
Pending Tasks: 0
Key cache: disabled
Row cache capacity: 10000000
Row cache size: 698671
Row cache hit rate: 0.5535463700149053
Compacted row minimum size: 391
Compacted row maximum size: 76890
Compacted row mean size: 635
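
As a rough sanity check on the row cache, the KvAds figures above can be turned into a back-of-the-envelope size estimate. This is only a sketch: it assumes the on-disk "Compacted row mean size" is a fair stand-in for the size of a cached row, and it ignores per-object JVM overhead entirely, so actual heap usage is likely higher. The function and variable names are made up for illustration.

```python
# Rough lower bound on the heap held by the KvAds row cache, using only
# numbers from the cfstats output above.
# ASSUMPTION: a cached row costs about as much as the mean compacted
# (on-disk) row; JVM per-object overhead is not counted.

GIB = 1024 ** 3


def row_cache_bytes(rows: int, mean_row_bytes: int) -> int:
    """Estimated payload size of `rows` cached rows of the given mean size."""
    return rows * mean_row_bytes


current_rows = 698_671      # "Row cache size"
capacity_rows = 10_000_000  # "Row cache capacity"
mean_row_bytes = 635        # "Compacted row mean size"

current = row_cache_bytes(current_rows, mean_row_bytes)
at_capacity = row_cache_bytes(capacity_rows, mean_row_bytes)

print(f"row cache now:         {current / GIB:.2f} GiB")
print(f"row cache at capacity: {at_capacity / GIB:.2f} GiB")
```

Under these assumptions the cache holds roughly 0.4 GiB of row data today and would hold close to 6 GiB if it ever filled to its configured capacity of 10 million rows, before any JVM overhead.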
top - 10:23:26 up 96 days, 23:04, 1 user, load average: 5.03, 6.21, 6.08
Tasks: 93 total, 1 running, 92 sleeping, 0 stopped, 0 zombie
Cpu(s): 92.1%us, 4.1%sy, 0.0%ni, 1.8%id, 0.0%wa, 0.5%hi, 1.5%si, 0.0%st
Mem: 16443880k total, 16357676k used, 86204k free, 43448k buffers
Swap: 4194296k total, 13912k used, 4180384k free, 2625024k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5757 cassandr 25 0 13.6g 12g 9860 S 197.2 82.3 9445:17 java