incubator-cassandra-user mailing list archives

From Jan Algermissen <jan.algermis...@nordsc.com>
Subject Tuning for heavy write load with limited RAM
Date Fri, 06 Sep 2013 17:32:54 GMT
I am trying to approach this in a more structured way, to make it more helpful for others.

AFAIU, my problem seems to be the combination of a heavy write load and very limited RAM (2 GB).

C*'s design seems to cause nodes to run out of heap space rather than slow down the processing
of incoming writes.

What I am looking for is a configuration setting that keeps C* stable and puts the burden on
the client to slow down if it wants all writes to be handled. (Sure, we can scale out, but
the point here is really that I want to configure C* to protect itself better from being
overwhelmed by writes.)
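
To make concrete what "the client slows down" could look like, here is a minimal sketch of
client-side backoff. It is an illustration only: send_write and WriteTimeout are hypothetical
stand-ins for whatever call and timeout error the real client library provides, and the delay
values are arbitrary.

```python
import time


class WriteTimeout(Exception):
    """Hypothetical error: the node did not acknowledge the write in time."""


def write_with_backoff(send_write, row, max_retries=5,
                       base_delay=0.05, max_delay=2.0, sleep=time.sleep):
    """Retry a write, doubling the pause after each timeout.

    send_write is a hypothetical callable that performs one write and
    raises WriteTimeout when the node is too busy to acknowledge it.
    """
    delay = base_delay
    for attempt in range(max_retries + 1):
        try:
            return send_write(row)
        except WriteTimeout:
            if attempt == max_retries:
                raise  # give up and let the caller decide what to do
            sleep(delay)                       # slow the client down
            delay = min(delay * 2, max_delay)  # exponential backoff, capped
```

This only helps if the node reports overload as a timeout before it runs out of heap, which
is exactly the behaviour I am trying to configure.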

What I think I understand so far is that there are four major areas to look at:

- Cache sizes
- Compaction
- GC
- Request handling

* Cache sizes *
Cache sizes are relevant only to reads, correct? As I do not do reads for now, I set every
key cache and row cache setting I could find to 0. Anything else?

* Compaction *
I do not fully understand how compaction thresholds interact with GC. For example, there is a comment[1]
in cassandra.yaml that tells me that other config switches are far more important than the
threshold defined by flush_largest_memtables_at.

I did set
- flush_largest_memtables_at: 0.50
- CMSInitiatingOccupancyFraction: 50
Does that make sense, or should I go lower, e.g. 0.1 and 10?
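
As a sanity check on such pairs of values, the rule from the cassandra.yaml comment quoted
in [1] can be written down as a small sketch. The function name and return labels are my own
invention; the only logic it encodes is what the comment states (1.0 disables the valve, and
a value lower than CMSInitiatingOccupancyFraction is not likely to be useful).

```python
def check_thresholds(flush_largest_memtables_at, cms_initiating_occupancy_percent):
    """Classify a (flush fraction, CMS percent) pair per the yaml comment.

    flush_largest_memtables_at is a fraction of the max heap (0.0 to 1.0);
    CMSInitiatingOccupancyFraction is given as a percentage (0 to 100).
    """
    cms_fraction = cms_initiating_occupancy_percent / 100.0
    if flush_largest_memtables_at >= 1.0:
        return "disabled"    # comment: "Set to 1.0 to disable."
    if flush_largest_memtables_at < cms_fraction:
        return "below-cms"   # comment: "not likely to be useful"
    return "ok"


print(check_thresholds(0.50, 50))  # the values I am using now
print(check_thresholds(0.10, 50))  # lowering only the flush threshold
```

It says nothing about whether any given pair is low enough for a heavy write load; it only
flags combinations the comment itself warns against.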

In addition, I have disabled compaction throttling (set to 0). Does that make sense?

And I did set
- in_memory_compaction_limit_in_mb: 1
Is that actually good or bad?

* GC *
Besides CMSInitiatingOccupancyFraction, I do not really understand what else I should do
regarding GC to prevent the OutOfMemory error I see in my logs.

* Request handling *
The goal here would be to find a configuration that prevents request processing when C* is
still busy writing to disk (which is what *I* want in this case, right?)

I have set
- rpc_server_type: hsha (though I only have one client, so sync would not make a difference)
- rpc_min_threads: 1
- rpc_max_threads: 1
(which also renders hsha vs sync irrelevant, right?)

Do you have any suggestions for what I could specifically monitor to see how the causes of
the out-of-memory error develop, and which other switches I should try?

Jan


[1]
# emergency pressure valve: each time heap usage after a full (CMS)
# garbage collection is above this fraction of the max, Cassandra will
# flush the largest memtables.
#
# Set to 1.0 to disable.  Setting this lower than
# CMSInitiatingOccupancyFraction is not likely to be useful.
#
# RELYING ON THIS AS YOUR PRIMARY TUNING MECHANISM WILL WORK POORLY:
# it is most effective under light to moderate load, or read-heavy
# workloads; under truly massive write load, it will often be too
# little, too late.
flush_largest_memtables_at: 0.50









