incubator-cassandra-dev mailing list archives

From Rishi Bhardwaj <>
Subject Re: Minimizing the impact of compaction on latency and throughput
Date Wed, 07 Jul 2010 18:07:25 GMT
I have done some bulk write performance tests and saw background
compaction have a large detrimental impact on write performance. I was
also wondering whether there is a tunable to limit how often compaction
runs on the sstables. If not, adding such a configuration option would
also help in controlling the performance impact of compaction.


From: Peter Schuller <>
Sent: Wed, July 7, 2010 10:09:25 AM
Subject: Minimizing the impact of compaction on latency and throughput


I have repeatedly seen users report that background compaction is
overly detrimental to the behavior of the node with respect to
latency. I have not yet deployed cassandra in a production situation
where latencies are closely monitored, but given the nature of
compaction these reports do not sound surprising to me. Unless
developers here on the list say otherwise, I tend to believe that it
is a real issue.

Ignoring implementation difficulties for a moment, a few things that
could improve the situation, that seem sensible to me, are:

* Use posix_fadvise() on both reads and writes to avoid
obliterating the operating system's caching of the sstables.
* Add the ability to rate limit disk I/O (in particular writes).
* Add the ability to perform direct I/O.
* Add the ability to fsync() regularly on writes to force the
operating system to not decide to flush hundreds of megabytes of data
out in a single burst.
* (Not an improvement but general observation: it seems useless for
writes to the commit log to remain in cache after an fsync(), and so
they are a good candidate for posix_fadvise())
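
To make the rate limiting and regular fsync() points concrete, here is
a minimal sketch in Python (not Cassandra code). The function name
write_throttled and its parameters are hypothetical, chosen only for
illustration. It caps write throughput by sleeping, and calls fsync()
every sync_every_bytes so that dirty data is flushed in small pieces
rather than in one large burst.

```python
import os
import time


def write_throttled(path, chunks, max_bytes_per_sec, sync_every_bytes):
    """Write chunks to path, capping throughput and fsyncing at intervals.

    Hypothetical helper for illustration. Returns the number of fsync()
    calls made, including the final one before the file is closed.
    """
    written = 0    # total bytes written so far
    unsynced = 0   # bytes written since the last fsync()
    syncs = 0
    start = time.monotonic()
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            written += len(chunk)
            unsynced += len(chunk)
            if unsynced >= sync_every_bytes:
                # Push Python's buffer to the OS, then force the OS to
                # write it out now instead of accumulating dirty pages.
                f.flush()
                os.fsync(f.fileno())
                syncs += 1
                unsynced = 0
            # Sleep until elapsed time matches the budget for the bytes
            # written so far (a simple average-rate throttle).
            target_elapsed = written / max_bytes_per_sec
            delay = target_elapsed - (time.monotonic() - start)
            if delay > 0:
                time.sleep(delay)
        f.flush()
        os.fsync(f.fileno())
        syncs += 1
    return syncs
```

The throttle here only bounds the average rate at which data is handed
to the operating system; it is the periodic fsync() that bounds how
much dirty data can pile up between flushes.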

None of these would be silver bullets, and the importance and
appropriate settings for each would be very dependent on operating
system, hardware, etc. But having the ability to control some or all
of these should, I suspect, allow significantly lessening the impact
of compaction under a variety of circumstances.

With respect to cache eviction, this is one area where the impact
can probably be expected to be higher the more you rely on the
operating system's caching, and the less you rely on in-JVM caching
done by cassandra.

The most obvious problem points to me include:

* posix_fadvise() and direct I/O cause portability and building
issues, necessitating native code.
* Rate limiting is very indirect due to read-ahead, caching, etc. For
writes in particular, rate limiting would likely be almost useless
without also having fsync() or direct I/O, unless the limit is
extremely low and the cluster is taking very few writes (such that
the typical background flushing done by most OSes happens often
enough to not involve huge amounts of data).
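
On the portability point, one way to soften it is to treat the cache
hint as optional and fall back to doing nothing where it is not
available. The sketch below uses Python, whose os module exposes
posix_fadvise() on platforms that have it; in the JVM the equivalent
call would need native code, as noted above. The function name
drop_from_page_cache is hypothetical.

```python
import os


def drop_from_page_cache(fd, offset=0, length=0):
    """Hint that a file range will not be needed again soon.

    Hypothetical helper for illustration. A length of 0 means "to the
    end of the file". Returns True if the hint was issued, False if the
    platform or filesystem does not support it.
    """
    if not hasattr(os, "posix_fadvise"):
        return False  # e.g. Windows or macOS: silently skip the hint
    try:
        # Dirty pages generally cannot be dropped, so flush first.
        os.fsync(fd)
        os.posix_fadvise(fd, offset, length, os.POSIX_FADV_DONTNEED)
    except OSError:
        return False
    return True
```

Because the call is only advisory, skipping it changes performance
characteristics but not correctness, which is what makes a silent
fallback acceptable here.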

Any thoughts? Has this already been considered and rejected? Do you
think compaction is in fact not a problem already? Are there other,
easier, better ways to accomplish the goal?

/ Peter Schuller
