cassandra-dev mailing list archives

From Wei Deng <>
Subject MAX_COMPACTING_L0, is it still important to enforce?
Date Thu, 18 Aug 2016 20:14:36 GMT
I was digging into LCS code lately, and found the following comments (note
the last paragraph "that would be ideal, but we can't"):

"        // The problem is that L0 has a much higher score (almost 250) than L1 (11), so what we'll
        // do is compact a batch of MAX_COMPACTING_L0 sstables with all 117 L1 sstables, and put the
        // result (say, 120 sstables) in L1. Then we'll compact the next batch of MAX_COMPACTING_L0,
        // and so forth.  So we spend most of our i/o rewriting the L1 data with each batch.
        // If we could just do *all* L0 a single time with L1, that would be ideal.  But we can't
        // -- see the javadoc for MAX_COMPACTING_L0."
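
To put rough numbers on the rewrite cost described in that comment, here is a
back-of-the-envelope sketch. It is not Cassandra code: the class and method
names are hypothetical, the L0 count of 1000 is an illustrative stand-in for
"thousands", and it assumes L1 stays at about 117 sstables and is read in full
on every round (in reality L1 grows as batches are merged in, so the true cost
is somewhat higher).

```java
// Hypothetical sketch, not part of Cassandra: estimates how often L1 is
// re-read when L0 is drained in fixed-size batches.
final class BatchRewriteSketch {

    /** Compaction rounds needed to drain l0Count sstables, batchSize at a time. */
    public static int rounds(int l0Count, int batchSize) {
        return (l0Count + batchSize - 1) / batchSize; // ceiling division
    }

    /**
     * Total sstables read over all rounds: each L0 sstable once, plus all of
     * L1 once per round. Assumes L1 stays at l1Count (a simplification).
     */
    public static long sstablesRead(int l0Count, int l1Count, int batchSize) {
        return (long) l0Count + (long) rounds(l0Count, batchSize) * l1Count;
    }

    public static void main(String[] args) {
        int l0 = 1000; // illustrative backlog
        int l1 = 117;  // L1 size from the quoted comment
        System.out.println("batches of 32: rounds=" + rounds(l0, 32)
                + ", sstables read=" + sstablesRead(l0, l1, 32));
        System.out.println("all L0 at once: rounds=" + rounds(l0, l0)
                + ", sstables read=" + sstablesRead(l0, l1, l0));
    }
}
```

Under these assumptions, batches of 32 take 32 rounds and read 4744 sstables
in total, against 1117 for a single pass over all of L0 plus L1.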

And then when I read the MAX_COMPACTING_L0 javadoc referenced above:

"    /**
     * limit the number of L0 sstables we do at once, because compaction bloom filter creation
     * uses a pessimistic estimate of how many keys overlap (none), so we risk wasting memory
     * or even OOMing when compacting highly overlapping sstables
     */"

I'm starting to wonder if this is still a concern post C* 2.1 given that
we've implemented CASSANDRA-5906. Here is an excerpt from Jonathan's blog
post (
on what motivated 5906 to be implemented:

"Because bloom filters are not re-sizeable, we need to pre-allocate them at
the start of the compaction, but at the start of the compaction, we don’t
know how much the sstables being compacted overlap. Since bloom filter
performance deteriorates dramatically when over-filled, we allocate our
bloom filters to be large enough even if the sstables do not overlap at
all. Which means that if they do overlap (which they should if compaction
is doing a good job picking candidates), then we waste space — up to 100%
per sstable compacted."
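
To illustrate the "up to 100%" figure, here is a minimal sketch that sizes a
bloom filter with the textbook formula m = -n ln(p) / (ln 2)^2. This is an
assumption for illustration only: Cassandra's own sizing code differs, and the
class name, method names, key counts and false-positive chance below are all
hypothetical.

```java
// Hypothetical sketch, not Cassandra's implementation: compares a bloom
// filter sized for "no overlap" with one sized for the real key count.
final class BloomSizingSketch {

    /** Textbook optimal bit count for expectedKeys at false-positive chance fpChance. */
    public static long bloomBits(long expectedKeys, double fpChance) {
        double ln2 = Math.log(2.0);
        return (long) Math.ceil(-expectedKeys * Math.log(fpChance) / (ln2 * ln2));
    }

    /** Pessimistic estimate: assume no keys overlap, so just sum the inputs. */
    public static long pessimisticKeys(long[] keysPerSstable) {
        long total = 0;
        for (long k : keysPerSstable) {
            total += k;
        }
        return total;
    }

    public static void main(String[] args) {
        // Two input sstables that hold exactly the same 1M keys (full overlap).
        long[] inputs = {1_000_000L, 1_000_000L};
        long actualKeys = 1_000_000L;
        double fpChance = 0.01; // illustrative value

        long allocated = bloomBits(pessimisticKeys(inputs), fpChance);
        long needed = bloomBits(actualKeys, fpChance);
        System.out.println("allocated bits: " + allocated);
        System.out.println("needed bits:    " + needed);
        System.out.println("ratio:          " + (double) allocated / needed);
    }
}
```

With two fully overlapping inputs the pessimistic filter is about twice the
size it needs to be, and the excess grows with the number of overlapping
sstables compacted together.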

Since 5906 has been in place for a few years to address this very issue,
does it make sense to revisit the MAX_COMPACTING_L0 choice (hard-coded to
32), now that the "bloom filter wasting memory" concern no longer applies? I
would imagine this could improve backlogged LCS behavior when we have
thousands of L0 SSTables.


