cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-2191) Multithread across compaction buckets
Date Sun, 10 Apr 2011 10:11:05 GMT


Sylvain Lebresne commented on CASSANDRA-2191:

Some comments on the new patches:
* I (really) don't like the stopTheWorld function. I don't like the name and arguments names
and I don't find clear the 'abstraction' it provides. But more importantly it makes the 'unmarking'
of sstable very fragile, since the markCompacting function *is not* in the 'try .. final'
block where the unmark function is. Let's just inline this stopTheWorld function instead (it
won't really be more code).
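To make the point concrete, here is a sketch of the inlined pattern (the class and method names are illustrative, not the actual patch): the mark happens immediately before entering the try block, so the finally clause always unmarks exactly what was marked, even if the compaction throws.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class CompactionSketch {
    private final Set<String> compacting = Collections.synchronizedSet(new HashSet<>());

    // Returns true only if none of the candidates were already being compacted.
    boolean markCompacting(Set<String> candidates) {
        synchronized (compacting) {
            for (String s : candidates)
                if (compacting.contains(s))
                    return false;
            compacting.addAll(candidates);
            return true;
        }
    }

    void unmarkCompacting(Set<String> candidates) {
        compacting.removeAll(candidates);
    }

    public boolean compact(Set<String> candidates) {
        if (!markCompacting(candidates))
            return false; // someone else is compacting these sstables
        try {
            // ... perform the actual compaction ...
            return true;
        } finally {
            unmarkCompacting(candidates); // always runs, even on failure
        }
    }
}
```

Because mark and unmark sit right next to each other around the try, there is no window where an exception can leave sstables marked forever.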
* The forcing of major compaction in doCompaction is buggy, because there is no guarantee
we haven't skipped some sstable (because of the skip option) or that some sstable hasn't been
removed from the set for lack of disk space. Forcing a major compaction in those cases is wrong.
I know that when you submit a major compaction you could end up doing a minor one just because
an sstable got flushed between the time you grabbed the set of sstables and the time you checked
that this set is still complete. But that is not a new problem (nor a big one, unlike wrongfully
removing tombstones), so let's open a separate ticket if we want to fix it.
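The safety condition could be as simple as this hypothetical helper (isMajor and the parameter names are illustrative): a compaction only qualifies as major, and may therefore purge tombstones, if the set of sstables actually compacted still covers every live sstable.

```java
import java.util.Set;

public class MajorCheck {
    // A compaction is only safe to treat as major (i.e. safe to purge
    // tombstones) if it covered every live sstable; if any sstable was
    // skipped or dropped for lack of disk space, it must count as minor.
    static boolean isMajor(Set<String> actuallyCompacted, Set<String> liveSSTables) {
        return actuallyCompacted.containsAll(liveSSTables);
    }
}
```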
* The ksname and cfname fields in CompactionController should be replaced by getters (for cfs.metadata.{ksname|cfname}),
if only to make sure we don't make the renaming of ks/cf more problematic than it already
is (it's probably not a problem here, but avoiding the duplication of fields would be good anyway).
* As far as I can tell we still have a problem with cache saving, in that two cache-saving
tasks can run at the same time (which would be bad). To answer the proposal in a previous
comment, I am not a fan of using the flush writer for this, as it has more important
things to do (and we would lose the reporting through JMX). A dedicated executor could solve
the problem, but it is a bit overkill. Maybe a static AtomicBoolean that cache-writing tasks
would atomically compare-and-set from false to true at the beginning: if that succeeds, they
do the cache writing and set the boolean back to false; otherwise the task just returns
immediately. This means a cache save would be discarded if one is already running, but that's
probably fine (it's actually probably what you want).
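Something like this sketch (class and method names are illustrative): the compareAndSet either wins the flag and runs the save, or loses and returns immediately, so at most one save is ever in flight.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class CacheSaver {
    private static final AtomicBoolean saving = new AtomicBoolean(false);

    // Runs the cache write unless one is already in progress, in which
    // case the request is simply discarded. Returns whether we ran.
    public static boolean trySave(Runnable writeCaches) {
        // atomically flip false -> true; failure means a save is already running
        if (!saving.compareAndSet(false, true))
            return false;
        try {
            writeCaches.run();
            return true;
        } finally {
            saving.set(false); // always release the flag, even on failure
        }
    }
}
```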
* I would still prefer an IdentityHashSet over a HashMap whose values are unused in CompactionExecutor,
for the sake of code clarity.
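The JDK has no IdentityHashSet class as such, but a Set with identity semantics can be built from an IdentityHashMap via Collections.newSetFromMap, which states the intent more clearly than carrying a Map whose values are never read. A small sketch:

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class IdentitySetDemo {
    public static void main(String[] args) {
        // identity-based Set view backed by an IdentityHashMap<T, Boolean>
        Set<String> inProgress =
            Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());

        String a = new String("sstable-1");
        String b = new String("sstable-1"); // equals(a) but a distinct object

        inProgress.add(a);
        System.out.println(inProgress.contains(a)); // true  (same reference)
        System.out.println(inProgress.contains(b)); // false (identity, not equals)
    }
}
```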
* In nodetool printCompactionStats, we should keep the println with plain English as it was
before, instead of using the much denser (and much less user-friendly) CompactionInfo.toString().
* In SSTableWriter.Builder, when creating the compactionInfo, including the OperationType as
it is done now will look ugly, and I think it will confuse the user more than anything else
(the previous message was probably quite enough).

FYI, my Monday morning happens to be quite a few hours before the Monday morning 'freeze'.
So if there are fixes for the things above by my Monday morning, I'll look at them first.

> Multithread across compaction buckets
> -------------------------------------
>                 Key: CASSANDRA-2191
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>            Priority: Critical
>              Labels: compaction
>             Fix For: 0.8
>         Attachments: 0001-Add-a-compacting-set-to-DataTracker.txt, 0002-Use-the-compacting-set-of-sstables-to-schedule-multith.txt,
0003-Expose-multiple-compactions-via-JMX-and-a-concrete-ser.txt, 0004-Allow-multithread-compaction-to-be-disabled.txt,
> This ticket overlaps with CASSANDRA-1876 to a degree, but the approaches and reasoning
are different enough to open a separate issue.
> The problem with compactions currently is that they compact the set of sstables that
existed the moment the compaction started. This means that for longer running compactions
(even when running as fast as possible on the hardware), a very large number of new sstables
might be created in the meantime. We have observed this proliferation of sstables killing
performance during major/high-bucketed compactions.
> One approach would be to pause compactions in upper buckets (containing larger files)
when compactions in lower buckets become possible. While this would likely solve the problem
with read performance, it does not actually help us perform compaction any faster, which is
a reasonable requirement for other situations.
> Instead, we need to be able to perform any compactions that are currently required in
parallel, independent of what bucket they might be in.

This message is automatically generated by JIRA.