incubator-cassandra-user mailing list archives

From Bill de hÓra <b...@dehora.net>
Subject Re: High read latency cluster
Date Fri, 08 Feb 2013 17:52:56 GMT
> FlushWriter                       0         0           8252         0               299

If you are not suffering from gc pressure/pauses (possibly not, because you don't seem to
have a lot of read failures in tpstats or outlier latency on the histograms), then the blocked
FlushWriter tasks are suggestive of memtable pressure, which may be followed by compactions
that grind the disk. 
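As a quick way to spot that kind of backpressure, here is a minimal sketch (not part of the original thread) that scans `nodetool tpstats` output for pools with a non-zero "All time blocked" count; the FlushWriter sample line is the one quoted above, the others are made up for illustration:

```python
# Sketch: flag thread pools whose tasks have ever blocked in
# `nodetool tpstats` output. A non-zero FlushWriter count is the
# memtable-flush backpressure discussed above.

SAMPLE_TPSTATS = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
FlushWriter                       0         0           8252         0               299
ReadStage                         3         1        1234567         0                 0
"""

def blocked_pools(tpstats_text):
    """Return (pool_name, all_time_blocked) for pools that have blocked."""
    results = []
    for line in tpstats_text.splitlines()[1:]:  # skip the header row
        parts = line.split()
        if len(parts) >= 6 and parts[-1].isdigit() and int(parts[-1]) > 0:
            results.append((parts[0], int(parts[-1])))
    return results

print(blocked_pools(SAMPLE_TPSTATS))  # -> [('FlushWriter', 299)]
```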

> maybe because of bloomfilters

If you think bloom filters or indexes are occupying heap on startup, then you could alleviate
things for a while with memtable/cache tuning, resampling the index interval, or increasing
the heap to 10G (yes, not generally recommended). Not enough working RAM can also impact
the key cache, which then puts more pressure on disk - check nodetool info to see if your
caches are being resized. If disk I/O is simply falling behind on a server that doesn't have
much memory headroom, then you'll want to expand the cluster at some point to spread out
the load. 
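To get a feel for how much heap bloom filters alone might claim, here is a back-of-the-envelope sketch (my addition, not from the thread) using the standard optimal Bloom filter sizing formula; the function name and the 100M-keys figure are illustrative, and Cassandra's actual allocation will differ somewhat:

```python
import math

def bloom_filter_bytes(num_keys, fp_chance):
    """Theoretical optimal Bloom filter size in bytes for a target
    false-positive chance: bits/key = -ln(p) / (ln 2)^2."""
    bits_per_key = -math.log(fp_chance) / (math.log(2) ** 2)
    return int(num_keys * bits_per_key / 8)

# At the bloom_filter_fp_chance = 0.01 used in the schemas below,
# 100 million keys on a node costs roughly 120 MB of heap per CF:
print(bloom_filter_bytes(100_000_000, 0.01) // 1_000_000)  # -> 119 (MB)
```

Raising fp_chance (e.g. to 0.1) trades more false-positive disk reads for a smaller filter, which is one of the knobs behind the "resampling / tuning" suggestion above.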

Bill


On 8 Feb 2013, at 13:03, Alain RODRIGUEZ <arodrime@gmail.com> wrote:

> Hi,
> 
> I have some big latencies (OpsCenter homepage shows an average of about 30-60 ms), causing
instability in my front servers: queries stack up waiting for C* to answer. The cluster is
C* 1.1.6:
> 
> 10.208.45.173   eu-west     1b          Up     Normal  297.02 GB       100.00%       0
> 10.208.40.6     eu-west     1b          Up     Normal  292.91 GB       100.00%       56713727820156407428984779325531226112
> 10.208.47.135   eu-west     1b          Up     Normal  307.96 GB       100.00%       113427455640312814857969558651062452224
> 
> I run on 3 AWS m1.xlarge instances with mostly the DataStax AMI default node configuration
(but with the options MAX_HEAP_SIZE="8G" and HEAP_NEWSIZE="400M"; I was under regular memory
pressure with the default 4 GB heap, maybe because of bloom filters). RF = 3, CL = QUORUM r/w.
> 
> I have a high load, from 4 to 15 with an average of 8 (mainly because of iowait, which
can reach 40-60%).
> 
> extract from "iostat -mx 5 10" :
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>           16.66    0.00    4.82   35.47    0.21   42.85
> 
> 
> I use compression and the Size Tiered Compaction Strategy for all of my CFs.
> 
> A typical CF :
> 
> create column family active_product
>   with column_type = 'Standard'
>   and comparator = 'UTF8Type'
>   and default_validation_class = 'UTF8Type'
>   and key_validation_class = 'UTF8Type'
>   and read_repair_chance = 0.1
>   and dclocal_read_repair_chance = 0.0
>   and gc_grace = 864000
>   and min_compaction_threshold = 4
>   and max_compaction_threshold = 12
>   and replicate_on_write = true
>   and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>   and caching = 'KEYS_ONLY'
>   and bloom_filter_fp_chance = 0.01
>   and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
> 
> And there is a typical counter CF:
> 
> create column family algo_product_view
>   with column_type = 'Standard'
>   and comparator = 'UTF8Type'
>   and default_validation_class = 'CounterColumnType'
>   and key_validation_class = 'UTF8Type'
>   and read_repair_chance = 0.1
>   and dclocal_read_repair_chance = 0.0
>   and gc_grace = 864000
>   and min_compaction_threshold = 4
>   and max_compaction_threshold = 12
>   and replicate_on_write = true
>   and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>   and caching = 'KEYS_ONLY'
>   and bloom_filter_fp_chance = 0.01
>   and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
> 
> I attach my cfhistograms, proxyhistograms, cfstats and tpstats, hoping a clue is somewhere
in there, even though I was unable to learn anything from them myself.
> 
> cfstats: http://pastebin.com/z3sAshjP
> tpstats: http://pastebin.com/LETPqfLV
> proxyhistograms: http://pastebin.com/FqwMFrxG
> cfhistograms (from the 2 most read / highest-latency CFs): http://pastebin.com/BCsdc50z
and http://pastebin.com/CGZZpydL
> 
> These latencies are quite annoying; I hope you'll help me figure out what I am doing
wrong or how I can tune Cassandra better.
> 
> Alain 

