incubator-cassandra-user mailing list archives

From Peter Schuller <>
Subject Re: JVM 7, Cass 1.1.1 and G1 garbage collector
Date Tue, 25 Sep 2012 04:22:31 GMT
> It is not a total waste, but practically your time is better spent in other
> places. The problem is just about everything is a moving target, schema,
> request rate, hardware. Generally tuning nudges a couple variables in one
> direction or the other and you see some decent returns. But each nudge takes
> a restart and a warm up period, and with how Cassandra distributes requests
> you likely have to flip several nodes or all of them before you can see the
> change! By the time you do that it's probably a different day or week.
> Essentially finding out if one setting is better than the other is like a 3
> day test in production.
> Before c* I used to deal with this in tomcat. Once in a while we would get a
> dev that read some article about tuning, something about a new jvm, or
> collector. With bright eyed enthusiasm they would want to try tuning our
> current cluster. They spend a couple days and measure something and say it
> was good "lower memory usage". Meanwhile someone else would come to me and
> say "higher 95th response time". More short pauses, fewer long pauses, great
> taste, less filling.

That's why blind blackbox testing isn't the way to go. Understanding
what the application does, what the GC does, and the goals you have in
mind is more fruitful. For example, are you trying to improve p99?
Maybe you want to improve p999 at the cost of worse p99? What about
failure modes (non-happy cases)? Perhaps you don't care about
few-hundred-ms pauses but want to avoid full GCs? There are lots of
different goals one might have, and different workloads.
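
To make the p99-versus-p999 trade-off concrete, here is a minimal
sketch in Java. The two pause distributions are made up for
illustration (they are not measurements from any real cluster), and
the class and method names are hypothetical. It uses a plain
nearest-rank percentile over sorted samples.

```java
import java.util.Arrays;

public class PausePercentiles {
    // Nearest-rank percentile: the smallest sample such that at least
    // num/den of all samples are <= it. Integer arithmetic avoids
    // floating-point rounding when computing the rank.
    public static double percentile(double[] sorted, int num, int den) {
        if (sorted.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long rank = ((long) sorted.length * num + den - 1) / den; // ceil(n * num / den)
        return sorted[(int) Math.max(rank, 1) - 1];
    }

    // Builds a sorted sample array containing counts[i] copies of valuesMs[i].
    public static double[] build(int[] counts, double[] valuesMs) {
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        double[] out = new double[total];
        int pos = 0;
        for (int i = 0; i < counts.length; i++) {
            Arrays.fill(out, pos, pos + counts[i], valuesMs[i]);
            pos += counts[i];
        }
        Arrays.sort(out);
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical setting A: mostly short pauses, plus a few full GCs.
        double[] a = build(new int[] {9800, 180, 20}, new double[] {5, 50, 2000});
        // Hypothetical setting B: more medium pauses, but no full GCs.
        double[] b = build(new int[] {9700, 300}, new double[] {5, 200});

        System.out.printf("A: p99=%.0f ms, p999=%.0f ms%n",
                percentile(a, 99, 100), percentile(a, 999, 1000));
        System.out.printf("B: p99=%.0f ms, p999=%.0f ms%n",
                percentile(b, 99, 100), percentile(b, 999, 1000));
    }
}
```

With these invented numbers, A comes out at p99 = 50 ms and p999 =
2000 ms, while B is at 200 ms for both. Which one is "better" depends
entirely on which of those you decided to care about beforehand.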

Testing is key, but only in combination with some directed choice of
what to tweak. Especially since it's hard to test for the
non-happy cases (e.g., node takes a burst of traffic and starts
promoting everything into old-gen prior to processing a request,
resulting in a death spiral).

> G1 is the perfect example of a time suck. Claims low pause latency for big
> heaps, and delivers something that the Cassandra community (and HBase as
> well) regards as working worse than CMS. If you spent 3 hours switching tuning
> knobs and analysing, that is 3 hours of your life you will never get back.

This is similar to saying that someone told you to switch to CMS (or,
use some particular flag, etc), you tried it, and it didn't have the
result you expected.

G1 and CMS have different trade-offs. Neither one will consistently
result in better latencies across the board. It's all about the

> Better to let SUN and other people worry about tuning (at least from where I
> sit)

They're not tuning. They are providing very general purpose default
behavior, including things that make *no* sense at all with Cassandra.
For example, the default behavior with CMS is to try to make the
marking phase run as late as possible so that it finishes just prior
to heap exhaustion, in order to "optimize" for throughput; except
that's not a good idea for many cases because it exacerbates
fragmentation problems in old-gen by pushing usage very high
repeatedly, and it increases the chance of full gc because marking
started too late (even if you don't hit promotion failures due to
fragmentation). Sudden changes in workloads (e.g., compaction kicks
in) also make it harder for CMS's mark-triggering heuristics to work

As such, the default options for Cassandra use certain settings that
diverge from the default behavior of the JVM, because
Cassandra-in-general is much more specific a use-case than the
completely general target audience of the JVM. Similarly, a particular
cluster (with certain workloads/goals/etc) is a yet more specific
use-case than Cassandra-in-general and may be better served by
settings that differ from those of default Cassandra.
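
Before changing anything, it helps to know what a given JVM is
actually running with. The following is a rough sketch, using the
standard java.lang.management API, that lists the active collectors
with their cumulative collection counts and times. The class name
GcReport is hypothetical. The flags named in the comment are, to the
best of my knowledge, the HotSpot options that control when CMS starts
marking; treat the value shown as an assumption and check it against
your own cassandra-env.sh.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcReport {
    // Assumption: on a CMS setup, marking can be made to start at a fixed
    // old-gen occupancy instead of "as late as possible" with options like
    //   -XX:+UseConcMarkSweepGC
    //   -XX:CMSInitiatingOccupancyFraction=75
    //   -XX:+UseCMSInitiatingOccupancyOnly
    // This class does not set them; it only reports what the JVM is doing.

    // Returns one line per collector registered in this JVM.
    public static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(String.format("%s: collections=%d, totalTimeMs=%d",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

Sampling this periodically before and after a change gives a crude
view of how often each collector ran and how much time it took. It
does not replace reading the GC logs.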

But, I certainly agree with this (which I think roughly matches what
you're saying): Don't randomly pick options someone claims are good in
a blog post and expect it to just make things better. If it were that
easy, it would be the default behavior for obvious reasons. The reason
it's not, is likely that it depends on the situation. Further, even if
you do play the lottery and win - if you don't know *why*, how are you
able to extrapolate the behavior of the system with slightly changed
workloads? It's very hard to blackbox-test GC settings, which is
probably why GC tuning can be perceived as a useless game of

/ Peter Schuller (@scode,
