cassandra-dev mailing list archives

From Jonathan Ellis <>
Subject Re: Parallel Compaction
Date Fri, 17 Dec 2010 05:41:29 GMT
Hi Germán,

Thanks for taking a stab at this!

I don't actually think there are going to be any tricky race conditions with
flush or schema migration; flush has been parallel for a long time itself,
and we already have the lock in CompactionManager for schema migration.

To clean this up for submission you'd want to follow the style guide at, remove the commented-out sections,
and add a configuration parameter for how many compactions to allow
simultaneously (IMO it only really makes sense to have > 1 when you are
running on SSDs, and there's no good way for us to auto-detect that).
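
As a rough illustration of the configuration parameter suggested above, here is a minimal sketch that caps the number of simultaneous compactions with a fixed-size thread pool. The class name BoundedCompactions and the concurrentCompactions setting are hypothetical, chosen for illustration; they are not taken from the patch or from the Cassandra code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: runs compaction tasks with a configurable upper bound
// on how many may execute at the same time.
public class BoundedCompactions {
    private final ExecutorService pool;

    // concurrentCompactions is an assumed setting name; 1 keeps compaction serial.
    public BoundedCompactions(int concurrentCompactions) {
        if (concurrentCompactions < 1) {
            throw new IllegalArgumentException("concurrentCompactions must be >= 1");
        }
        this.pool = Executors.newFixedThreadPool(concurrentCompactions);
    }

    // Submits all tasks and waits for them; results come back in task order.
    public <T> List<T> runAll(List<Callable<T>> tasks) {
        List<T> results = new ArrayList<>();
        try {
            for (Future<T> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        }
        return results;
    }

    public void shutdown() {
        pool.shutdown();
    }
}
```

With the pool size set to 1 this behaves like serial compaction, so a default of 1 would leave spinning-disk setups unchanged while letting SSD users raise the limit by hand.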

On Thu, Dec 16, 2010 at 6:04 PM, Germán Kondolf <> wrote:

> Hi everybody,
> I've just finished the first implementation of a Parallel Compaction
> patch for the trunk version. Tomorrow I'll test it with a high volume
> of data to see if it works as I expect, but first I want to
> validate the approach with you.
> I know it's kind of naive, but maybe it works as a starting point for a
> future production implementation, or at least allows making the
> compaction strategy configurable.
> First of all, I don't know the C* code in depth, so maybe I took a few
> shortcuts, and that's why I need a second look from an expert...
> I've modified the doCompaction method of CompactionManager, added a
> few static classes (I'm working to remove them, so V2 is coming), and
> simply split the sstables to compact in a balanced order and fired
> each group compaction in parallel.
> The revision I've based the patch on is: 1050234
> The files are attached, the patch and the
> Thanks in advance, I'll appreciate the feedback.
> --
> //GK
> // sites
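
The quoted message does not say how the sstables were balanced across groups. One plausible reading, sketched below purely as an assumption, is a greedy split by on-disk size: sort the sizes in descending order and always place the next sstable into the currently lightest group. The class BalancedSplit and its split method are hypothetical names, and plain long values stand in for sstables.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: greedy size-balanced grouping of sstables.
// Each Long is an sstable size in bytes, standing in for a real sstable object.
public class BalancedSplit {

    // Returns `groups` lists whose total sizes are roughly equal.
    public static List<List<Long>> split(List<Long> sstableSizes, int groups) {
        if (groups < 1) {
            throw new IllegalArgumentException("groups must be >= 1");
        }
        // Largest first, so the big sstables are spread out before the small ones.
        List<Long> sorted = new ArrayList<>(sstableSizes);
        sorted.sort(Collections.reverseOrder());

        List<List<Long>> buckets = new ArrayList<>();
        long[] totals = new long[groups];
        for (int i = 0; i < groups; i++) {
            buckets.add(new ArrayList<>());
        }

        for (long size : sorted) {
            // Find the group with the smallest running total (first one wins ties).
            int lightest = 0;
            for (int i = 1; i < groups; i++) {
                if (totals[i] < totals[lightest]) {
                    lightest = i;
                }
            }
            buckets.get(lightest).add(size);
            totals[lightest] += size;
        }
        return buckets;
    }
}
```

Each resulting group could then be handed to its own compaction task. Whether compacting such groups independently gives the desired on-disk result is a separate question that this sketch does not address.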

Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
