cassandra-commits mailing list archives

From "Benedict (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-6609) Reduce Bloom Filter Garbage Allocation
Date Wed, 22 Jan 2014 07:23:20 GMT


Benedict commented on CASSANDRA-6609:

bq. patch that reduces garbage by a factor of 6

Regrettably, no. This is only the bloom filter garbage, and I actually overstated - that's
closer to the max savings as things stand. Probably closer to a factor of 2-3 on average. It
depends on the number of bloom hashes we use. I only performed an isolated test of reads from
a bloom filter; I haven't looked at the impact on a C* instance.

Still, I think it's a worthwhile tradeoff, but a bit of tweaking will probably get us closer
to the original performance. I'd also like to figure out why the long[] for hash() is not
being stack allocated. It definitely should be. We shouldn't be allocating any garbage at
all for this loop.
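To make the escape problem concrete, here is a minimal sketch (hypothetical names and a placeholder mixing function, not Cassandra's actual BloomFilter or MurmurHash code): a fresh long[] that is returned from hash() escapes the method, so the JIT's escape analysis cannot scalar-replace it and every call produces a small young-gen object, whereas a caller-supplied scratch buffer allocates nothing in steady state.

```java
// Hypothetical sketch, not Cassandra's actual API: per-call allocation vs. buffer reuse.
public class BloomScratchSketch {

    // Allocating variant: the fresh long[] is returned, i.e. it escapes the
    // method, so escape analysis cannot eliminate the allocation.
    static long[] hashAllocating(long key) {
        long[] out = new long[2];
        out[0] = key * 0x9E3779B97F4A7C15L;         // placeholder mixing, not MurmurHash
        out[1] = Long.rotateLeft(out[0], 31) ^ key;
        return out;
    }

    // Reusing variant: the caller owns the buffer (e.g. a thread-local
    // scratch array), so steady-state lookups allocate no garbage.
    static void hashInto(long key, long[] out) {
        out[0] = key * 0x9E3779B97F4A7C15L;
        out[1] = Long.rotateLeft(out[0], 31) ^ key;
    }

    public static void main(String[] args) {
        long[] scratch = new long[2];
        hashInto(42L, scratch);
        long[] fresh = hashAllocating(42L);
        // Both variants compute the same hash; only the allocation differs.
        System.out.println(fresh[0] == scratch[0] && fresh[1] == scratch[1]);
    }
}
```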

> Reduce Bloom Filter Garbage Allocation
> --------------------------------------
>                 Key: CASSANDRA-6609
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>         Attachments: tmp.diff
> Just spotted that we allocate potentially large amounts of garbage on bloom filter lookups,
since we allocate a new long[] for each hash() and to store the bucket indexes we visit, in
a manner that guarantees they are allocated on the heap. With a lot of sstables and many requests,
this could easily be hundreds of megabytes of young gen churn per second.
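The "hundreds of megabytes per second" figure is plausible on a back-of-the-envelope basis. The sketch below uses assumed array sizes, sstable counts, and request rates (illustrative values, not measurements from a real C* instance):

```java
// Back-of-envelope estimate of allocation churn from per-lookup arrays.
// All figures are illustrative assumptions, not Cassandra measurements.
public class ChurnEstimate {
    public static void main(String[] args) {
        long bytesPerHashArray = 16 + 2 * 8;   // array header + long[2] payload (~32 B)
        long bytesPerBucketArray = 16 + 5 * 8; // header + one index per bloom hash (~56 B)
        long sstablesPerRead = 20;             // bloom filter consulted per sstable
        long readsPerSecond = 100_000;

        long bytesPerSecond =
            (bytesPerHashArray + bytesPerBucketArray) * sstablesPerRead * readsPerSecond;
        System.out.println(bytesPerSecond / (1024 * 1024) + " MB/s of young-gen churn");
    }
}
```

Under these assumptions the two throwaway arrays alone account for roughly 167 MB/s of young-gen allocation, consistent with the estimate in the description.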

This message was sent by Atlassian JIRA
