incubator-cassandra-user mailing list archives

From Wei Zhu <wz1...@yahoo.com>
Subject Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01
Date Wed, 27 Mar 2013 18:00:29 GMT
Welcome to the wonderland of the LCS SSTable size. There is some discussion around it, but no guidelines yet.

I asked people on IRC; someone is running as high as 128MB in production with no problems. I guess you have to test it on your system and see how it performs.
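
If you do try a larger size, it can be changed on a live CF. A minimal sketch from cassandra-cli ( double check the exact syntax on your version ):

    [default@KS] update column family CF with compaction_strategy_options = {sstable_size_in_mb: 128};

As far as I know, only newly written sstables pick up the new size; the existing ones keep theirs until they are compacted again.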

Attached is the related thread for your reference.

-Wei

----- Original Message -----
From: "Andras Szerdahelyi" <andras.szerdahelyi@ignitionone.com>
To: user@cassandra.apache.org
Sent: Wednesday, March 27, 2013 1:19:06 AM
Subject: Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01


Aaron,

> What version are you using ?

1.1.9

> Have you changed the bf_chance ? The sstables need to be rebuilt for it to take effect.

I did ( several times ) and I ran upgradesstables after.
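
( That was, each time, something along the lines of

    nodetool -h localhost upgradesstables KS CF

with our real keyspace / column family names in place of KS / CF. )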
> Not sure what this means.
> Are you saying it's in a boat on a river, with tangerine trees and marmalade skies ?

You nailed it. A significant number of reads are served from hundreds of sstables ( I have to add, compaction is apparently running a constant 6000-7000 tasks behind, and the vast majority of the reads access recently written data ).

> Take a look at the nodetool cfhistograms to get a better idea of the row size and use that
> info when considering the sstable size.

It's around 1-20K. What should I optimise the LCS sstable size for? I suppose "I want to fit as many complete rows as possible into a single sstable to keep the file count down, while avoiding compactions of oversized ( double-digit gigabytes? ) sstables at the higher levels"?
Do I have to run a major compaction after a change to sstable_size_in_mb ? The larger sstable size wouldn't really affect sstables on levels above L0, would it?
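
( Back of the envelope, using the mean row size from cfstats below, if my reasoning is right: a 5MB sstable holds roughly 5MB / 3041 bytes ≈ 1,700 rows, and a 128MB one ≈ 44,000. )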

Thanks!!
Andras

From: aaron morton <aaron@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday 26 March 2013 21:46
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01

What version are you using ?
1.2.0 allowed a null bf chance, and I think it returned .1 for LCS and .01 for STS compaction.

Have you changed the bf_chance ? The sstables need to be rebuilt for it to take effect.

> and sstables read is in the skies

Not sure what this means.
Are you saying it's in a boat on a river, with tangerine trees and marmalade skies ?

> SSTable count: 22682

Lots of files there; I imagine this would dilute the effectiveness of the key cache, since it's caching (sstable, key) tuples.
You may want to look at increasing the sstable_size with LCS.
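
Roughly: 119,771,434,915 bytes live over 22,682 sstables is about 5MB per sstable, which lines up with your sstable_size_in_mb: 5. At, say, 128MB that same data would be on the order of 900 files.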

> Compacted row minimum size: 104
> Compacted row maximum size: 263210
> Compacted row mean size: 3041

Take a look at the nodetool cfhistograms to get a better idea of the row size and use that info when considering the sstable size.
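
Something like ( with your keyspace / CF names ):

    nodetool -h localhost cfhistograms KS CF

The Row Size column gives you the full distribution rather than just the min / mean / max above, and the SSTables column shows how many sstables your reads are actually touching.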


Cheers

----------------- 
Aaron Morton 
Freelance Cassandra Consultant 
New Zealand 


@aaronmorton 
http://www.thelastpickle.com 


On 26/03/2013, at 6:16 AM, Andras Szerdahelyi <andras.szerdahelyi@ignitionone.com> wrote:

Hello list,

Could anyone shed some light on how an FP chance of 0.01 can coexist with a measured FP ratio of .. 0.98 ? Am I reading this wrong, or do 98% of the requests hitting the bloom filter produce a false positive, while the "target" false ratio is 0.01?
( Also, the key cache hit ratio is around 0.001 and sstables read is in the skies ( a non-exponential (non-)drop-off for LCS ), but that should be filed under "effect" and not "cause"? )

[default@unknown] use KS; 
Authenticated to keyspace: KS 
[default@KS] describe CF; 
ColumnFamily: CF 
Key Validation Class: org.apache.cassandra.db.marshal.BytesType 
Default column value validator: org.apache.cassandra.db.marshal.BytesType 
Columns sorted by: org.apache.cassandra.db.marshal.BytesType 
GC grace seconds: 691200 
Compaction min/max thresholds: 4/32 
Read repair chance: 0.1 
DC Local Read repair chance: 0.0 
Replicate on write: true 
Caching: ALL 
Bloom Filter FP chance: 0.01 
Built indexes: [] 
Compaction Strategy: org.apache.cassandra.db.compaction.LeveledCompactionStrategy 
Compaction Strategy Options: 
sstable_size_in_mb: 5 
Compression Options: 
sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor 



Keyspace: KS 
Read Count: 628950 
Read Latency: 93.19921121869784 ms. 
Write Count: 1219021 
Write Latency: 0.14352380885973254 ms. 
Pending Tasks: 0 
Column Family: CF 
SSTable count: 22682 
Space used (live): 119771434915 
Space used (total): 119771434915 
Number of Keys (estimate): 203837952 
Memtable Columns Count: 13125 
Memtable Data Size: 33212827 
Memtable Switch Count: 15 
Read Count: 629009 
Read Latency: 88.434 ms. 
Write Count: 1219038 
Write Latency: 0.095 ms. 
Pending Tasks: 0 
Bloom Filter False Positives: 37939419 
Bloom Filter False Ratio: 0.97928 
Bloom Filter Space Used: 261572784 
Compacted row minimum size: 104 
Compacted row maximum size: 263210 
Compacted row mean size: 3041 
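
( If I am doing the math right, that is 37939419 false positives over roughly 629009 reads, i.e. on the order of 60 bloom filter false positives per read, which would fit reads touching a lot of those 22682 sstables. )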


I upgraded sstables after changing the FP chance.


Thanks! 
Andras 
