cassandra-commits mailing list archives

From "Yuki Morishita (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-5906) Avoid allocating over-large bloom filters
Date Fri, 22 Nov 2013 18:40:35 GMT


Yuki Morishita commented on CASSANDRA-5906:

So far, I have tested HLL++ on its own, measuring serialized size and error % with various
parameters.
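For orientation, the usual rule of thumb for HyperLogLog relates the precision parameter p (used in the figures that follow) to the register count and the expected relative error. The sketch below is only that rule of thumb, not a measurement: it assumes 2^p registers, a standard error of about 1.04/sqrt(2^p), and 6 bits per register for the dense form. The function name and the 6-bit figure are illustrative assumptions; the actual serialized size depends on the library and on whether sparse mode is in use.

```python
import math


def hll_dense_estimates(p, bits_per_register=6):
    """Rule-of-thumb figures for a dense HyperLogLog sketch with precision p.

    Assumptions (not taken from the ticket): m = 2**p registers,
    standard error ~ 1.04 / sqrt(m), registers packed at 6 bits each.
    """
    m = 1 << p                                   # number of registers
    std_error = 1.04 / math.sqrt(m)              # expected relative error
    dense_bytes = math.ceil(m * bits_per_register / 8)
    return m, std_error, dense_bytes


if __name__ == "__main__":
    for p in (16, 13):
        m, err, size = hll_dense_estimates(p)
        print(f"p={p}: registers={m}, std error ~{err:.2%}, dense size ~{size} bytes")
```

Dropping p by 3 divides the register count by 8, at the cost of a larger expected error.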

We can reduce the serialized size from that of the parameters originally posted here
(p=16, sp=0) to less than 10k with p=13, sp=25. Using sparse mode also saves space when the
number of partitions is small. I think a relative error of 2% in the estimated number of
partitions is tolerable for constructing a bloom filter (though I don't have a formula to
prove it :P).
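One rough way to sanity-check the 2% tolerance is to use the textbook bloom filter formulas. The sketch below sizes a filter for an underestimated key count and then evaluates the false-positive rate at the true count. It assumes the standard sizing m = -n ln(f) / (ln 2)^2, k = round((m/n) ln 2), and the approximation (1 - e^(-kn/m))^k. All function names are hypothetical, and Cassandra's own bloom filter sizing may differ, so this is an illustration rather than a proof.

```python
import math


def bloom_bits(n, target_fp):
    """Textbook number of bits for n keys at a target false-positive rate."""
    return math.ceil(-n * math.log(target_fp) / (math.log(2) ** 2))


def bloom_hashes(m, n):
    """Textbook number of hash functions for m bits and n keys."""
    return max(1, round(m / n * math.log(2)))


def bloom_fp_rate(m, k, n):
    """Approximate false-positive rate with m bits, k hashes, n inserted keys."""
    return (1.0 - math.exp(-k * n / m)) ** k


def fp_after_underestimate(true_n, rel_error, target_fp):
    """Size the filter for an estimate that is rel_error too low,
    then report the false-positive rate at the true key count."""
    est_n = round(true_n * (1.0 - rel_error))
    m = bloom_bits(est_n, target_fp)
    k = bloom_hashes(m, est_n)
    return bloom_fp_rate(m, k, true_n)


if __name__ == "__main__":
    fp = fp_after_underestimate(true_n=1_000_000, rel_error=0.02, target_fp=0.01)
    print(f"false-positive rate with a 2% underestimate: {fp:.4%}")
```

Under these assumptions, a 2% underestimate moves a 1% target false-positive rate to roughly 1.1%, which is in line with treating the error as tolerable. An overestimate errs in the safe direction and only costs a little extra memory.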

> Avoid allocating over-large bloom filters
> -----------------------------------------
>                 Key: CASSANDRA-5906
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Yuki Morishita
>             Fix For: 2.1
> We conservatively estimate the number of partitions post-compaction to be the total number
> of partitions pre-compaction.  That is, we assume the worst-case scenario of no partition
> overlap at all.
> This can result in substantial memory wasted in sstables resulting from highly overlapping
