cassandra-user mailing list archives

From Sylvain Lebresne
Subject Re: multithreaded compaction
Date Tue, 26 Apr 2011 07:35:26 GMT
On Tue, Apr 26, 2011 at 9:01 AM, Terje Marthinussen wrote:
> Hi,
> I was testing the multithreaded compactions and with 2x6 cores (24 with HT)
> it does seem a bit crazy with 24 compactions running concurrently.
> It is probably not very good in terms of random I/O.

It does seem a bit overkill. However, I'm slightly curious how you ended up
with 24 parallel compactions. More precisely, how did you end up with enough
sstables to trigger 24 compactions? Was that done on purpose for testing's
sake, or did you see that in a real situation?

I'm asking because in a 'real' situation, given that compactions are triggered
only if there is some number of files to compact, and provided the cluster is
correctly provisioned, I wouldn't expect the number of parallel compactions to
jump to such numbers (one of the goals of multi-threaded compaction was to make
sure we never end up accumulating lots of un-compacted sstables). Anyway, I get
your point, just wondering if that was a real

> As such, I think I agree with the argument in 2191 that there should be a
> config option for this.
> Probably a default that is dynamic with 1 thread per column family +2 or 3
> thread for parallel compactions outside of that could be good.
> Any other opinions?

Multi-threaded compaction is optional and compaction throttling is supposed to
mitigate it. However, I do agree that too many compactions may be a bad use of
resources because of random I/O, even if correctly throttled. I think it's
missing a configuration option "concurrent_compactions", like there is
"concurrent_writes|reads". For that, I have created
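
To illustrate what a "concurrent_compactions" cap could look like, here is a
minimal sketch using a fixed-size thread pool. It is not Cassandra's actual
implementation: the class name, the runCompactions helper, and the
concurrentCompactions parameter are all hypothetical, and the sleep is only a
stand-in for real compaction work.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedCompactionSketch {

    /**
     * Submits `tasks` simulated compactions to a pool that is limited to
     * `concurrentCompactions` threads (a hypothetical setting) and returns
     * the highest number of compactions observed running at the same time.
     */
    static int runCompactions(int tasks, int concurrentCompactions) {
        ExecutorService pool = Executors.newFixedThreadPool(concurrentCompactions);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max); // record peak concurrency
                try {
                    Thread.sleep(20); // stand-in for the actual compaction work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }

        // Stop accepting new work and wait for the queued compactions to drain.
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // 24 pending compactions, but never more than 4 running at once.
        System.out.println("peak concurrency: " + runCompactions(24, 4));
    }
}
```

With such a cap, pending compactions queue up instead of all hitting the disks
at once, which is the random I/O concern raised above. Throttling would still
limit throughput per compaction; the cap only limits how many run in parallel.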

> I guess the compaction thread pool should also show up in tpstats?

Yes it should ... and it will ... eventually :)

Thanks for the feedback.

