incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01
Date Thu, 28 Mar 2013 03:17:34 GMT
> You nailed it. A significant number of reads are done from hundreds of sstables ( I have
to add, compaction is apparently constantly 6000-7000 tasks behind and the vast majority of
the reads access recently written data )
So that's not good. 
If IO is saturated then maybe LCS is not for you; remember it uses more IO than STS. 
Otherwise look at the compaction settings in cassandra.yaml (such as compaction_throughput_mb_per_sec
and concurrent_compactors) to see if you can make it go faster, but watch out that you don't hurt normal requests. 
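
For a rough sense of scale, the figures in the cfstats output quoted further down can be run through some back-of-the-envelope arithmetic. The sketch below is only that. It assumes the reported "Bloom Filter False Ratio" is false positives divided by all positive filter answers (false plus true), that sstable_size_in_mb counts units of 1024*1024 bytes, and that every filter behaves exactly at the configured 0.01 chance. The function names are made up for illustration and are not part of Cassandra.

```python
import math

# Figures copied from the cfstats / describe output quoted later in this thread.
READ_COUNT = 629_009
BLOOM_FALSE_POSITIVES = 37_939_419
BLOOM_FALSE_RATIO = 0.97928
LIVE_BYTES = 119_771_434_915
FP_CHANCE = 0.01

MIB = 1024 * 1024  # assumption: sstable_size_in_mb is in units of 1024*1024 bytes


def false_positives_per_read(false_positives: int, reads: int) -> float:
    """Average number of bloom filter false positives charged to each read."""
    return false_positives / reads


def implied_true_positives(false_positives: int, false_ratio: float) -> float:
    """Solve ratio = fp / (fp + tp) for tp (assumed definition of the ratio)."""
    return false_positives * (1.0 - false_ratio) / false_ratio


def implied_filters_checked_per_read(fp_per_read: float, fp_chance: float) -> float:
    """Filters consulted per read if each one misfires at exactly fp_chance."""
    return fp_per_read / fp_chance


def estimated_sstable_count(live_bytes: int, sstable_size_mb: int) -> int:
    """Rough file count if all live data sat in sstables of the target size."""
    return math.ceil(live_bytes / (sstable_size_mb * MIB))


if __name__ == "__main__":
    fp_per_read = false_positives_per_read(BLOOM_FALSE_POSITIVES, READ_COUNT)
    tp = implied_true_positives(BLOOM_FALSE_POSITIVES, BLOOM_FALSE_RATIO)
    print(f"false positives per read: {fp_per_read:.1f}")
    print(f"implied true positives per read: {tp / READ_COUNT:.2f}")
    print(f"implied filters checked per read: "
          f"{implied_filters_checked_per_read(fp_per_read, FP_CHANCE):.0f}")
    for size_mb in (5, 32, 128):
        print(f"{size_mb:>4} MB target -> ~{estimated_sstable_count(LIVE_BYTES, size_mb)} sstables")
```

Under those assumptions each read averages around 60 false positives against only one or two true positives, so nearly every positive answer is a false one. That would let a false ratio near 0.98 sit alongside a per-filter chance of 0.01 when each read consults thousands of filters. The same arithmetic puts the data set at fewer than 900 files with a 128 MB target, compared with more than 22,000 at 5 MB. Treat these as order-of-magnitude numbers, not measurements.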

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 28/03/2013, at 7:00 AM, Wei Zhu <wz1975@yahoo.com> wrote:

> Welcome to the wonderland of SSTableSize of LCS. There is some discussion around it,
but no guidelines yet. 
> 
> I asked on IRC; someone is running as high as 128M in production with
no problem. I guess you have to test it on your system and see how it performs. 
> 
> Attached is the related thread for your reference.
> 
> -Wei
> 
> ----- Original Message -----
> From: "Andras Szerdahelyi" <andras.szerdahelyi@ignitionone.com>
> To: user@cassandra.apache.org
> Sent: Wednesday, March 27, 2013 1:19:06 AM
> Subject: Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01
> 
> 
> Aaron, 
> 
> 
> 
> 
> What version are you using ? 
> 
> 
> 1.1.9 
> 
> 
> 
> 
> 
> Have you changed the bf_ chance ? The sstables need to be rebuilt for it to take effect.

> 
> 
> I did ( several times ) and I ran upgradesstables after 
> 
> 
> 
> 
> 
> Not sure what this means. 
> Are you saying it's in a boat on a river, with tangerine trees and marmalade skies ?

> 
> 
> You nailed it. A significant number of reads are done from hundreds of sstables ( I have
to add, compaction is apparently constantly 6000-7000 tasks behind and the vast majority of
the reads access recently written data ) 
> 
> 
> 
> 
> 
> Take a look at the nodetool cfhistograms to get a better idea of the row size and use
that info when considering the sstable size. 
> 
> 
> It's around 1-20K. What should I optimise the LCS sstable size for? I suppose "I want
to fit as many complete rows as possible into a single sstable to keep the file count down while
avoiding compactions of oversized ( double digit gigabytes? ) sstables at higher levels"? 
> Do I have to run a major compaction after a change to sstable_size_in_mb ? The larger
sstable size wouldn't really affect sstables on levels above L0 , would it? 
> 
> 
> 
> 
> 
> 
> Thanks!! 
> Andras 
> 
> 
> 
> 
> 
> 
> From: aaron morton < aaron@thelastpickle.com > 
> Reply-To: " user@cassandra.apache.org " < user@cassandra.apache.org > 
> Date: Tuesday 26 March 2013 21:46 
> To: " user@cassandra.apache.org " < user@cassandra.apache.org > 
> Subject: Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01 
> 
> 
> 
> 
> What version are you using ? 
> 1.2.0 allowed a null bf chance, and I think it returned .1 for LCS and .01 for STS compaction.

> Have you changed the bf_ chance ? The sstables need to be rebuilt for it to take effect.

> 
> 
> 
> 
> 
> and sstables read is in the skies 
> Not sure what this means. 
> Are you saying it's in a boat on a river, with tangerine trees and marmalade skies ?

> 
> 
> 
> 
> 
> SSTable count: 22682 
> 
> Lots of files there, I imagine this would dilute the effectiveness of the key cache.
It's caching (sstable, key) tuples. 
> You may want to look at increasing the sstable_size with LCS. 
> 
> 
> 
> 
> 
> Compacted row minimum size: 104 
> Compacted row maximum size: 263210 
> 
> 
> Compacted row mean size: 3041 
> Take a look at the nodetool cfhistograms to get a better idea of the row size and use
that info when considering the sstable size. 
> 
> 
> Cheers 
> 
> 
> 
> 
> 
> 
> 
> 
> ----------------- 
> Aaron Morton 
> Freelance Cassandra Consultant 
> New Zealand 
> 
> 
> @aaronmorton 
> http://www.thelastpickle.com 
> 
> 
> On 26/03/2013, at 6:16 AM, Andras Szerdahelyi < andras.szerdahelyi@ignitionone.com > wrote: 
> 
> 
> 
> 
> Hello list, 
> 
> 
> Could anyone shed some light on how an FP chance of 0.01 can coexist with a measured FP ratio
of .. 0.98 ? Am I reading this wrong, or are 98% of the requests hitting the bloom filter creating
a false positive while the "target" false ratio is 0.01? 
> ( Also key cache hit ratio is around 0.001 and sstables read is in the skies ( non-exponential
(non-) drop off for LCS ) but that should be filed under "effect" and not "cause"? ) 
> 
> 
> 
> [default@unknown] use KS; 
> Authenticated to keyspace: KS 
> [default@KS] describe CF; 
> ColumnFamily: CF 
> Key Validation Class: org.apache.cassandra.db.marshal.BytesType 
> Default column value validator: org.apache.cassandra.db.marshal.BytesType 
> Columns sorted by: org.apache.cassandra.db.marshal.BytesType 
> GC grace seconds: 691200 
> Compaction min/max thresholds: 4/32 
> Read repair chance: 0.1 
> DC Local Read repair chance: 0.0 
> Replicate on write: true 
> Caching: ALL 
> Bloom Filter FP chance: 0.01 
> Built indexes: [] 
> Compaction Strategy: org.apache.cassandra.db.compaction.LeveledCompactionStrategy 
> Compaction Strategy Options: 
> sstable_size_in_mb: 5 
> Compression Options: 
> sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor 
> 
> 
> 
> Keyspace: KS 
> Read Count: 628950 
> Read Latency: 93.19921121869784 ms. 
> Write Count: 1219021 
> Write Latency: 0.14352380885973254 ms. 
> Pending Tasks: 0 
> Column Family: CF 
> SSTable count: 22682 
> Space used (live): 119771434915 
> Space used (total): 119771434915 
> Number of Keys (estimate): 203837952 
> Memtable Columns Count: 13125 
> Memtable Data Size: 33212827 
> Memtable Switch Count: 15 
> Read Count: 629009 
> Read Latency: 88.434 ms. 
> Write Count: 1219038 
> Write Latency: 0.095 ms. 
> Pending Tasks: 0 
> Bloom Filter False Positives: 37939419 
> Bloom Filter False Ratio: 0.97928 
> Bloom Filter Space Used: 261572784 
> Compacted row minimum size: 104 
> Compacted row maximum size: 263210 
> Compacted row mean size: 3041 
> 
> 
> I upgraded sstables after changing the FP chance 
> 
> 
> Thanks! 
> Andras 
> <attachment.eml>

