cassandra-user mailing list archives

From "Desimpel, Ignace"
Subject Endless loop LCS compaction
Date Thu, 07 Nov 2013 11:48:14 GMT
Tested on version 2.0.1 and 2.0.2.

With the cluster completely idle (nothing being stored or queried), I see that some column
family (which one varies with the tests I run) gets compacted over and over again; this has
now been going on for 48 hours. Total data size is only 3.5 GB. The column family was created
with SSTableSize: 10 MB.

From remote debugging, my guess is that the loop is caused by some extra code in
LeveledManifest::getCompactionCandidates that tries to fall back to STCS when compaction
falls behind (see the variable 'score' in the code).

In my case I see the following variable values during execution (version 2.0.2) of
LeveledManifest::getCompactionCandidates:
level : 1
sstablesInLevel : 42
remaining : 42
total bytes for remaining : 448 MB
max size for level : 100 MB
score : 4.27
Because the score is 4.27, the code takes a special branch, still inside
LeveledManifest::getCompactionCandidates (version 2.0.2), with these values:
Generations[0].size() : 77
Candidates : 77
Pairs : 77
Buckets : one list entry of 77 files
mostInteresting : 32 files
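
For reference, a minimal sketch of how the numbers above could fit together. This is not the
Cassandra source: the class and method names are hypothetical, and the 10^level size rule, the
1.001 score threshold and the cap of 32 level 0 files are assumptions that should be checked
against the 2.0.2 code. Reading "448 MB" as 448,000,000 bytes and the level 1 limit as
10 x 10 MiB is also a guess, but under that reading the ratio comes out at about 4.27.

```java
public class LcsScoreSketch {
    // Assumed cap on level 0 files per compaction; matches the 32 "mostInteresting" files above.
    static final int MAX_COMPACTING_L0 = 32;
    // Assumed threshold above which a level counts as over its size limit.
    static final double SCORE_THRESHOLD = 1.001;

    // Assumed size limit for level >= 1: 10^level times the configured sstable size.
    static long maxBytesForLevel(int level, long sstableSizeBytes) {
        return (long) Math.pow(10, level) * sstableSizeBytes;
    }

    // Score = bytes currently in the level divided by the level's size limit.
    static double score(long totalBytes, int level, long sstableSizeBytes) {
        return (double) totalBytes / maxBytesForLevel(level, sstableSizeBytes);
    }

    // The "special branch": some level is over its limit AND level 0 holds more files than the cap.
    static boolean takesSizeTieredBranch(double score, int l0FileCount) {
        return score > SCORE_THRESHOLD && l0FileCount > MAX_COMPACTING_L0;
    }

    public static void main(String[] args) {
        long sstableSize = 10L * 1024 * 1024; // SSTableSize 10 MB, read here as 10 MiB (assumption)
        long level1Bytes = 448_000_000L;      // "448 MB", read here as decimal megabytes (assumption)
        double s = score(level1Bytes, 1, sstableSize);
        System.out.println("score = " + s);
        System.out.println("size-tiered branch taken: " + takesSizeTieredBranch(s, 77));
    }
}
```

With 77 files in level 0 and level 1 over its limit, both conditions in this sketch hold, which
is consistent with the branch being entered on every pass.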

These 32 mostInteresting files are returned to the function LeveledCompactionStrategy::getMaximalTask
and marked for compaction, and a new LeveledCompactionTask is created (is the goal here not
to create an STCS task?).

This task then does its job and creates a new set of level 0 files of 10 MB each. So 32 files
are again created from the 32 files we started with, and once the compaction loop restarts
it does exactly the same thing again and again.
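
The loop described above can be illustrated with a toy simulation. It is only a sketch under
stated assumptions: every level 0 file is 10 MB, compaction removes no data (nothing is
overwritten or deleted while idle), the output is split into files of at most the configured
sstable size and lands back in level 0, and the cap of 32 files per round is assumed. The
class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class LcsLoopSketch {
    // Assumed cap on level 0 files picked per round.
    static final int MAX_COMPACTING_L0 = 32;

    // One round: pick up to 32 level 0 files, rewrite their bytes as files of at most
    // sstableSizeMb each, and put the results back into level 0.
    static List<Integer> compactOnce(List<Integer> l0SizesMb, int sstableSizeMb) {
        int take = Math.min(MAX_COMPACTING_L0, l0SizesMb.size());
        long totalMb = 0;
        for (int i = 0; i < take; i++) {
            totalMb += l0SizesMb.get(i);
        }
        // Files that were not picked stay in level 0 unchanged.
        List<Integer> result = new ArrayList<>(l0SizesMb.subList(take, l0SizesMb.size()));
        // The leveled-style task splits its output at the sstable size limit.
        while (totalMb > 0) {
            int out = (int) Math.min(sstableSizeMb, totalMb);
            result.add(out);
            totalMb -= out;
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> l0 = new ArrayList<>(Collections.nCopies(77, 10)); // 77 files of 10 MB
        for (int round = 1; round <= 3; round++) {
            l0 = compactOnce(l0, 10);
            System.out.println("round " + round + ": level 0 files = " + l0.size());
        }
    }
}
```

Under these assumptions the level 0 file count never drops, so the condition that triggered
the round is still true afterwards. A size-tiered task would instead write one large file per
round, which would shrink the count.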

Should an STCS task be created from within the LCS strategy? Or the optimization simply be

Ignace Desimpel
