incubator-cassandra-user mailing list archives

From Alain RODRIGUEZ <arodr...@gmail.com>
Subject Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01
Date Thu, 28 Mar 2013 09:18:14 GMT
"remember is used more IO than STS"

Do you mean during compactions ? Because I thought that LCS should
decrease the number of disk reads (since 90% of the data isn't spread
across multiple sstables and C* needs to read only one file to find the
entire row) while not compacting, right ?
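Alain's expectation can be put in rough numbers. A back-of-envelope sketch, not Cassandra code: the 90% figure is the LCS design guarantee quoted above, and the level count used for the remaining 10% is a hypothetical worst case, not a measured value.

```python
# Expected sstables touched per read under LCS's guarantee that ~90% of
# reads are served from a single sstable. For the other 10%, assume (as a
# pessimistic illustration) one sstable probed per level.
def expected_sstables_per_read(single_hit_fraction=0.90, levels=5):
    # 90% of reads hit exactly one sstable; the rest touch one per level.
    return single_hit_fraction * 1 + (1 - single_hit_fraction) * levels

print(expected_sstables_per_read())  # 1.4 under the assumptions above
```

Even with the pessimistic tail, the expected file count per read stays low, which is the basis for Alain's question: the hundreds-of-sstables reads seen in this thread are far outside what LCS should produce once compaction has caught up.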


2013/3/28 aaron morton <aaron@thelastpickle.com>

> You nailed it. A significant number of reads are done from hundreds of
> sstables ( I have to add, compaction is apparently constantly 6000-7000
> tasks behind and the vast majority of the reads access recently written
> data )
>
> So that's not good.
> If IO is saturated then maybe LCS is not for you; remember it uses more IO
> than STS.
> Otherwise look at the compaction yaml settings to see if you can make it
> go faster but watch out that you don't hurt normal requests.
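Aaron's pointer at "the compaction yaml settings" can be sketched concretely. These knob names are from the Cassandra 1.1-era configuration; verify them against your version's cassandra.yaml, and treat the values as illustrative assumptions rather than recommendations.

```shell
# Raise the global compaction throughput cap (MB/s) at runtime,
# without a restart:
nodetool -h localhost setcompactionthroughput 32

# The equivalent persistent settings in cassandra.yaml (restart required):
#   compaction_throughput_mb_per_sec: 32
#   concurrent_compactors: 4    # often set per core; default derives from cores

# Watch the pending-task backlog to see whether it drains:
nodetool -h localhost compactionstats
```

As Aaron warns, both knobs trade read/write latency for compaction speed, so raise them incrementally while watching normal request latencies.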
>
> Cheers
>
>    -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 28/03/2013, at 7:00 AM, Wei Zhu <wz1975@yahoo.com> wrote:
>
> Welcome to the wonderland of the LCS sstable size setting. There is some
> discussion around it, but no guidelines yet.
>
> I asked people on IRC; someone is running as high as 128M in
> production with no problems. I guess you have to test it on your system and
> see how it performs.
>
> Attached is the related thread for your reference.
>
> -Wei
>
> ----- Original Message -----
> From: "Andras Szerdahelyi" <andras.szerdahelyi@ignitionone.com>
> To: user@cassandra.apache.org
> Sent: Wednesday, March 27, 2013 1:19:06 AM
> Subject: Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01
>
>
> Aaron,
>
> What version are you using ?
>
>
> 1.1.9
>
> Have you changed the bloom filter fp_chance ? The sstables need to be
> rebuilt for it to take effect.
>
>
> I did ( several times ) and I ran upgradesstables after
>
> Not sure what this means.
> Are you saying it's in a boat on a river, with tangerine trees and
> marmalade skies ?
>
>
> You nailed it. A significant number of reads are done from hundreds of
> sstables ( I have to add, compaction is apparently constantly 6000-7000
> tasks behind and the vast majority of the reads access recently written
> data )
>
> Take a look at nodetool cfhistograms to get a better idea of the row
> size and use that info when considering the sstable size.
>
>
> It's around 1-20K. What should I optimise the LCS sstable size for? I
> suppose: "I want to fit as many complete rows as possible into a single
> sstable to keep the file count down, while avoiding compactions of
> oversized ( double-digit gigabytes? ) sstables at higher levels"?
> Do I have to run a major compaction after a change to sstable_size_in_mb ?
> The larger sstable size wouldn't really affect sstables on levels above L0,
> would it?
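Andras's sizing question can be framed in rough numbers. A back-of-envelope sketch using the compacted row sizes reported in the cfstats later in this thread; it ignores compression and per-row overhead, so treat the results as order-of-magnitude estimates only.

```python
# How many mean-sized rows fit in one sstable at a given sstable_size_in_mb?
MEAN_ROW_BYTES = 3041     # "Compacted row mean size" from cfstats
MAX_ROW_BYTES = 263210    # "Compacted row maximum size": even the largest
                          # row fits comfortably in a 5MB sstable

def rows_per_sstable(sstable_size_in_mb):
    return (sstable_size_in_mb * 1024 * 1024) // MEAN_ROW_BYTES

print(rows_per_sstable(5))     # ~1724 rows at the current 5MB setting
print(rows_per_sstable(128))   # ~44136 rows at the 128M some users run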
>
> Thanks!!
> Andras
>
> From: aaron morton < aaron@thelastpickle.com >
> Reply-To: " user@cassandra.apache.org " < user@cassandra.apache.org >
> Date: Tuesday 26 March 2013 21:46
> To: " user@cassandra.apache.org " < user@cassandra.apache.org >
> Subject: Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01
>
> What version are you using ?
> 1.2.0 allowed a null bf chance, and I think it returned .1 for LCS and .01
> for STS compaction.
> Have you changed the bloom filter fp_chance ? The sstables need to be
> rebuilt for it to take effect.
>
> "and sstables read is in the skies"
> Not sure what this means.
> Are you saying it's in a boat on a river, with tangerine trees and
> marmalade skies ?
>
> SSTable count: 22682
>
> Lots of files there, I imagine this would dilute the effectiveness of the
> key cache. It's caching (sstable, key) tuples.
> You may want to look at increasing the sstable_size with LCS.
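Aaron's key-cache point can be quantified roughly. A sketch using the "Space used (live)" figure from the cfstats pasted below in this thread; real sstable counts depend on level fill and compression, so this is only indicative.

```python
# Approximate sstable count as a function of sstable_size_in_mb.
SPACE_LIVE_BYTES = 119_771_434_915   # "Space used (live)" from cfstats

def approx_sstable_count(sstable_size_in_mb):
    return SPACE_LIVE_BYTES // (sstable_size_in_mb * 1024 * 1024)

print(approx_sstable_count(5))    # ~22844, close to the observed 22682
print(approx_sstable_count(128))  # ~892 files at a 128MB sstable size
```

Since the key cache stores (sstable, key) tuples, cutting the file count by a factor of ~25 shrinks the number of distinct cache entries a hot key can occupy, which is the mechanism behind Aaron's suggestion to raise sstable_size.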
>
> Compacted row minimum size: 104
> Compacted row maximum size: 263210
>
>
> Compacted row mean size: 3041
> Take a look at nodetool cfhistograms to get a better idea of the row
> size and use that info when considering the sstable size.
>
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
>
> @aaronmorton
> http://www.thelastpickle.com
>
>
> On 26/03/2013, at 6:16 AM, Andras Szerdahelyi <
> andras.szerdahelyi@ignitionone.com > wrote:
>
> Hello list,
>
>
> Could anyone shed some light on how an FP chance of 0.01 can coexist with a
> measured FP ratio of .. 0.98 ? Am I reading this wrong, or do 98% of the
> requests hitting the bloom filter produce a false positive while the
> "target" false ratio is 0.01?
> ( Also, the key cache hit ratio is around 0.001 and sstables read is in the
> skies ( non-exponential (non-)drop-off for LCS ), but that should be filed
> under "effect" and not "cause"? )
>
>
>
> [default@unknown] use KS;
> Authenticated to keyspace: KS
> [default@KS] describe CF;
> ColumnFamily: CF
> Key Validation Class: org.apache.cassandra.db.marshal.BytesType
> Default column value validator: org.apache.cassandra.db.marshal.BytesType
> Columns sorted by: org.apache.cassandra.db.marshal.BytesType
> GC grace seconds: 691200
> Compaction min/max thresholds: 4/32
> Read repair chance: 0.1
> DC Local Read repair chance: 0.0
> Replicate on write: true
> Caching: ALL
> Bloom Filter FP chance: 0.01
> Built indexes: []
> Compaction Strategy:
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy
> Compaction Strategy Options:
> sstable_size_in_mb: 5
> Compression Options:
> sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
>
>
>
> Keyspace: KS
> Read Count: 628950
> Read Latency: 93.19921121869784 ms.
> Write Count: 1219021
> Write Latency: 0.14352380885973254 ms.
> Pending Tasks: 0
> Column Family: CF
> SSTable count: 22682
> Space used (live): 119771434915
> Space used (total): 119771434915
> Number of Keys (estimate): 203837952
> Memtable Columns Count: 13125
> Memtable Data Size: 33212827
> Memtable Switch Count: 15
> Read Count: 629009
> Read Latency: 88.434 ms.
> Write Count: 1219038
> Write Latency: 0.095 ms.
> Pending Tasks: 0
> Bloom Filter False Positives: 37939419
> Bloom Filter False Ratio: 0.97928
> Bloom Filter Space Used: 261572784
> Compacted row minimum size: 104
> Compacted row maximum size: 263210
> Compacted row mean size: 3041
>
>
> I upgraded sstables after changing the FP chance
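The cfstats numbers above allow a rough consistency check of the filter sizing. This is a sketch using the standard optimal-bloom-filter formula, not Cassandra internals, and it only checks that the allocated space matches the configured fp_chance.

```python
# Does "Bloom Filter Space Used" match a configured fp_chance of 0.01?
# For an optimal bloom filter, bits per key = -ln(p) / ln(2)^2.
import math

BF_SPACE_BYTES = 261_572_784    # "Bloom Filter Space Used" from cfstats
KEY_ESTIMATE = 203_837_952      # "Number of Keys (estimate)"

bits_per_key = BF_SPACE_BYTES * 8 / KEY_ESTIMATE
theoretical = -math.log(0.01) / (math.log(2) ** 2)

print(round(bits_per_key, 1))   # ~10.3 bits/key actually allocated
print(round(theoretical, 1))    # ~9.6 bits/key needed for p = 0.01
```

So the filters do appear sized for roughly 0.01. One plausible reading of the 0.98 measured ratio is that fp_chance applies per sstable: with ~22k sstables, a single read can probe thousands of filters, and the chance of at least one false positive per read (1 - 0.99^N) approaches 1 long before N reaches that count.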
>
>
> Thanks!
> Andras
> <attachment.eml>
>
>
>
