cassandra-dev mailing list archives

From Anuj Wadehra <>
Subject Possible Bug: bucket_low has no effect in STCS
Date Mon, 13 Jun 2016 17:45:31 GMT

I am trying to understand the STCS algorithm. As per my current understanding of the code,
setting bucket_low has no effect on the STCS compaction algorithm. I also see a possible
optimization. I would appreciate it if one of the designers could correct me, or confirm that
this is a bug so that I can raise a JIRA.

The getBuckets() method of SizeTieredCompactionStrategy sorts sstables by size in ascending
order and then iterates over them one by one, adding each to an existing bucket or a new one.
When iterating sstables in ascending order of size, I can't find ANY scenario where the
current sstable in the outer loop iteration is below the oldAverageSize of any existing bucket.
The sstable being iterated will ALWAYS be greater than or equal to the oldAverageSize of
ALL existing buckets, because ALL previous sstables in existing buckets were smaller than or
equal in size to the sstable being iterated.

So there is NO scenario where size > (oldAverageSize * bucketLow) and size < oldAverageSize,
and the bucket_low property never comes into play no matter what value you set for it.
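
To make the argument concrete, here is a minimal sketch of the bucketing loop as described
above. It is NOT the actual Cassandra source: the class name BucketLowSketch, the Bucket
helper, the lowerBoundRejections counter and the sample sizes are all hypothetical, and it
uses an ordered list of buckets instead of the map keyed by average size. It assumes
bucket_low is below 1.0. The counter records every comparison in which the bucket_low bound
alone rejected an sstable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BucketLowSketch {
    /** Hypothetical stand-in for a bucket: member sizes plus their running integer average. */
    static final class Bucket {
        final List<Long> sizes = new ArrayList<>();
        long average;

        void add(long size) {
            long total = average * sizes.size();   // total implied by the old average
            sizes.add(size);
            average = (total + size) / sizes.size();
        }
    }

    /** Comparisons in the last run where only the bucketLow bound rejected the sstable. */
    public static int lowerBoundRejections;

    public static List<List<Long>> bucket(long[] input, double bucketLow,
                                          double bucketHigh, long minSSTableSize) {
        long[] sorted = input.clone();
        Arrays.sort(sorted);                        // ascending by size, as in the description
        lowerBoundRejections = 0;
        List<Bucket> buckets = new ArrayList<>();

        outer:
        for (long size : sorted) {
            for (Bucket b : buckets) {
                boolean aboveLow = size > b.average * bucketLow;
                boolean belowHigh = size < b.average * bucketHigh;
                boolean bothSmall = size < minSSTableSize && b.average < minSSTableSize;
                if (!aboveLow && belowHigh && !bothSmall) {
                    lowerBoundRejections++;         // bucketLow decided the outcome
                }
                if ((aboveLow && belowHigh) || bothSmall) {
                    b.add(size);
                    continue outer;
                }
            }
            Bucket fresh = new Bucket();            // no similar bucket found
            fresh.add(size);
            buckets.add(fresh);
        }

        List<List<Long>> result = new ArrayList<>();
        for (Bucket b : buckets) {
            result.add(b.sizes);
        }
        return result;
    }

    public static void main(String[] args) {
        long[] sizes = {10, 12, 100, 110, 120, 1000, 1500, 4000};  // made-up sizes
        System.out.println(bucket(sizes, 0.5, 1.5, 50));
        System.out.println("lower-bound rejections: " + lowerBoundRejections);
    }
}
```

If my reading is right, running this with different bucketLow values below 1.0 should give
the same buckets every time, with the counter staying at zero.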

Also, while iterating over sstables (sortedFiles) by size in ascending order, there is no
point in iterating over all existing buckets. We could just start from the LAST bucket, the
one the previous sstable was added to. The oldAverageSize of ALL other buckets will NEVER
admit the sstable being iterated.

 for (Entry<Long, List<T>> entry : buckets.entrySet())
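
A rough sketch of the suggested optimization, again NOT the Cassandra source: the class
LastBucketSketch, the lastOnly flag and the sample sizes are hypothetical, and buckets are
kept in creation order in a list rather than in the map. With lastOnly set, each sstable is
compared only against the most recently created bucket; without it, every bucket is scanned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LastBucketSketch {
    public static List<List<Long>> bucket(long[] input, double bucketLow, double bucketHigh,
                                          long minSSTableSize, boolean lastOnly) {
        long[] sorted = input.clone();
        Arrays.sort(sorted);                          // ascending by size
        List<List<Long>> buckets = new ArrayList<>(); // buckets in creation order
        List<Long> averages = new ArrayList<>();      // averages.get(i) belongs to buckets.get(i)

        outer:
        for (long size : sorted) {
            // Either scan every bucket, or only the most recently created one.
            int start = lastOnly ? Math.max(0, buckets.size() - 1) : 0;
            for (int i = start; i < buckets.size(); i++) {
                long avg = averages.get(i);
                boolean similar = size > avg * bucketLow && size < avg * bucketHigh;
                boolean bothSmall = size < minSSTableSize && avg < minSSTableSize;
                if (similar || bothSmall) {
                    List<Long> b = buckets.get(i);
                    // Update the running integer average before adding the new member.
                    averages.set(i, (avg * b.size() + size) / (b.size() + 1));
                    b.add(size);
                    continue outer;
                }
            }
            List<Long> fresh = new ArrayList<>();     // no similar bucket found
            fresh.add(size);
            buckets.add(fresh);
            averages.add(size);
        }
        return buckets;
    }

    public static void main(String[] args) {
        long[] sizes = {10, 12, 100, 110, 120, 1000, 1500, 4000};  // made-up sizes
        List<List<Long>> full = bucket(sizes, 0.5, 1.5, 50, false);
        List<List<Long>> last = bucket(sizes, 0.5, 1.5, 50, true);
        System.out.println("same buckets: " + full.equals(last));
    }
}
```

The reasoning behind it: a new bucket is only created after an sstable has failed the upper
bound (and the small-file check) of every existing bucket, and every later sstable is at
least as large, so those older buckets should never match again.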


