cassandra-user mailing list archives

From Martin Mačura <m.mac...@gmail.com>
Subject Re: Bloom filter false positives high
Date Wed, 17 Apr 2019 10:48:41 GMT
Both tables use the default bloom_filter_fp_chance of 0.01 ...

CREATE TABLE ... (
   a int,
   b int,
   bucket timestamp,
   ts timeuuid,
   c int,
...
   PRIMARY KEY ((a, b, bucket), ts, c)
) WITH CLUSTERING ORDER BY (ts DESC, c ASC)
   AND bloom_filter_fp_chance = 0.01
   AND compaction = {'class':
'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
'compaction_window_size': '1', 'compaction_window_unit': 'DAYS',
'tombstone_threshold': '0.9', 'unchecked_tombstone_compaction':
'false'}
   AND dclocal_read_repair_chance = 0.0
   AND default_time_to_live = 63072000
   AND gc_grace_seconds = 10800
...
   AND read_repair_chance = 0.0
   AND speculative_retry = 'NONE';


CREATE TABLE ... (
   c int,
   b int,
   bucket timestamp,
   ts timeuuid,
...
   PRIMARY KEY ((c, b, bucket), ts)
) WITH CLUSTERING ORDER BY (ts DESC)
   AND bloom_filter_fp_chance = 0.01
   AND compaction = {'class':
'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
'compaction_window_size': '1', 'compaction_window_unit': 'DAYS',
'tombstone_threshold': '0.9', 'unchecked_tombstone_compaction':
'false'}
   AND dclocal_read_repair_chance = 0.0
   AND default_time_to_live = 63072000
   AND gc_grace_seconds = 10800
...
   AND read_repair_chance = 0.0
   AND speculative_retry = 'NONE';
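
As a rough sanity check on the filter sizes, here is a minimal Python sketch that compares the Bloom filter space reported for each table with its estimated partition count. The figures are copied from the table stats quoted below; the helper names are made up for illustration. It assumes the textbook sizing formula for an optimal Bloom filter, -ln(p) / (ln 2)^2 bits per key, which may not match Cassandra's internal sizing exactly. It also divides by the table-wide partition estimate, while each SSTable keeps its own filter, so a partition present in several SSTables is counted once here but stored in several filters.

```python
import math

MIB = 1024 * 1024


def theoretical_bits_per_key(fp_chance: float) -> float:
    """Bits per key for an optimally sized Bloom filter (textbook formula)."""
    return -math.log(fp_chance) / (math.log(2) ** 2)


def observed_bits_per_partition(filter_mib: float, partitions: int) -> float:
    """Reported filter space divided by the estimated partition count."""
    return filter_mib * MIB * 8 / partitions


# (Bloom filter space used in MiB, estimated partitions), from the stats below.
tables = {
    "first table (false ratio 0.68472)": (17.82, 8592749),
    "second table (false ratio 0.00010)": (54.9, 25460768),
}

expected = theoretical_bits_per_key(0.01)
print(f"textbook bits/key at fp_chance=0.01: {expected:.2f}")

observed = {
    name: observed_bits_per_partition(mib, parts)
    for name, (mib, parts) in tables.items()
}
for name, bits in observed.items():
    print(f"{name}: {bits:.2f} filter bits per estimated partition")
```

Both tables come out at a similar number of filter bits per estimated partition (roughly 17 to 18), so under these assumptions the smaller filter on the first table tracks its lower partition count rather than a different bloom_filter_fp_chance.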

On Wed, Apr 17, 2019 at 12:25 PM Stefan Miklosovic <
stefan.miklosovic@instaclustr.com> wrote:

> What is your bloom_filter_fp_chance for each table? I guess it is
> bigger for the first one. The bigger that number is (between 0 and 1),
> the less memory the filter will use (17 MiB against 54.9 MiB), which
> means the more false positives you will get.
>
> On Wed, 17 Apr 2019 at 19:59, Martin Mačura <m.macura@gmail.com> wrote:
> >
> > Hi,
> > I have a table with a poor bloom filter false ratio:
> >                SSTable count: 1223
> >                Space used (live): 726.58 GiB
> >                Number of partitions (estimate): 8592749
> >                Bloom filter false positives: 35796352
> >                Bloom filter false ratio: 0.68472
> >                Bloom filter space used: 17.82 MiB
> >                Compacted partition maximum bytes: 386857368
> >
> > It's a time series, TWCS compaction, window size 1 day, data partitioned
> > in daily buckets, TTL 2 years.
> >
> > I have another table with a similar schema, but it is not affected for
> > some reason:
> >                SSTable count: 1114
> >                Space used (live): 329.87 GiB
> >                Number of partitions (estimate): 25460768
> >                Bloom filter false positives: 156942
> >                Bloom filter false ratio: 0.00010
> >                Bloom filter space used: 54.9 MiB
> >                Compacted partition maximum bytes: 20924300
> >
> > Thanks for any advice,
> >
> > Martin
>
