cassandra-user mailing list archives

From Edward Capriolo
Subject Re: Capacity problem with a lot of writes?
Date Fri, 26 Nov 2010 16:34:16 GMT
On Fri, Nov 26, 2010 at 10:49 AM, Peter Schuller wrote:
>> Making compaction parallel isn't a priority because the problem is
>> almost always the opposite: how do we spread it out over a longer
>> period of time instead of sharp spikes of activity that hurt
>> read/write latency.  I'd be very surprised if latency would be
>> acceptable if you did have parallel compaction.  In other words, your
>> real problem is you need more capacity for your workload.
> Do you expect this to be true even with the I/O situation improved
> (i.e., under conditions where the additional I/O is not a problem)? It
> seems counter-intuitive to me that single-core compaction would make a
> huge impact on latency when compaction is CPU bound on a 8+ core
> system under moderate load (even taking into account cache
> coherency/NUMA etc).
> --
> / Peter Schuller


I wanted to mention a specific technique I used to get out of a
situation I ran into. We had a large influx of data that pushed the
limits of our current hardware; as stated above, the true answer was
more hardware. However, a single node failed several large
compactions. After 2 or 3 big compactions failed, we ended up with
~1000 SSTables for a column family.

This turned into a chicken-and-egg situation: reads were slow because
there were many SSTables and extra data like tombstones, while
compaction was brutally slow because of the read/write traffic.

My solution was to create a side-by-side install on the same box. I
used different data directories and different ports
(/var/lib/cassandra/compact, 9168, etc.), moved the data to the new
install, and started it up. Then I ran nodetool compact on the new
instance. The new instance was seeing no read or write traffic.
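
As a rough illustration only, the "move the data to the new install"
step (and the later move back) might be scripted as below. The helper
name and both directory arguments are hypothetical, not taken from the
original setup, and the sketch assumes neither instance is running
against the directories while the files are moved.

```python
import shutil
from pathlib import Path


def move_data_files(src_dir, dst_dir):
    """Move every regular file in src_dir into dst_dir.

    Returns the sorted list of file names that were moved.
    Both directories are placeholders: point them at whichever
    directories hold the column family's files on your install.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(src.iterdir()):
        if path.is_file():
            shutil.move(str(path), str(dst / path.name))
            moved.append(path.name)
    return moved


# Hypothetical usage (paths are assumptions, adjust to your layout):
#   move_data_files("/path/to/live/data", "/var/lib/cassandra/compact")
#   ... start the side instance, run nodetool compact, stop it ...
#   move_data_files("/var/lib/cassandra/compact", "/path/to/live/data")
```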

I was surprised to see the machine at 400%/1600% CPU used with not
much I/O wait. Compacting 600 GB of small SSTables took about 4 days.
(However, when SSTables are larger, I have compacted 400 GB in 4 hours
on the same hardware.)
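
To put the two figures side by side, here is a back-of-the-envelope
calculation of average throughput. It assumes 1 GB = 1000 MB and
treats "about 4 days" as exactly 96 hours, so the results are rough
averages rather than measurements.

```python
def compaction_rate_mb_per_s(size_gb: float, hours: float) -> float:
    """Average compaction throughput in MB/s, taking 1 GB = 1000 MB."""
    return size_gb * 1000 / (hours * 3600)


# 600 GB of small SSTables in about 4 days (assumed to be 96 hours)
small = compaction_rate_mb_per_s(600, 4 * 24)
# 400 GB of larger SSTables in 4 hours
large = compaction_rate_mb_per_s(400, 4)

print(f"small SSTables: {small:.2f} MB/s")   # 1.74 MB/s
print(f"large SSTables: {large:.2f} MB/s")   # 27.78 MB/s
print(f"ratio: {large / small:.0f}x")        # 16x
```

On those assumptions, the large-SSTable case ran at roughly 16 times
the average rate of the small-SSTable case on the same hardware.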

After that, I moved the data file back in place and started the node
back into the cluster. I have lived on both sides of the fence,
wanting either long, slow compactions or breakneck fast ones.

I believe there is room for other compaction models. I am interested
in systems that can optimize the case with multiple data directories,
for example. From my experiment, it seems a major compaction cannot
fully utilize the hardware in specific conditions. Knowing which
models to use where, and how to automatically select the optimal
strategy, are interesting concerns, though.
