From Todd Lipcon <t...@cloudera.com>
Subject Re: large heaps
Date Thu, 13 Jun 2013 05:30:47 GMT
Hey Nicolas,

I've corresponded with that guy a few times in the past -- back when I
was attempting to hack some patches into G1 for better performance on
HBase. The end result of that investigation was the MSLAB feature
which made it into 0.90.x.

The main thing I learned about GC is that big heaps aren't in
themselves problematic -- they don't tend to make young gen pauses
take longer. The only problem is that if you eventually hit a
stop-the-world CMS pause, the length of that pause grows linearly with
the size of the heap. So, the trick is avoiding stop-the-world CMS.
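
One rough way to check whether old-gen collections are happening at all
is to read the JVM's standard GarbageCollectorMXBean counters. The
sketch below is only an illustration: the class name GcSummary is made
up, collector names vary by JVM and flags, and the reported time is
accumulated collection time rather than individual pause lengths, so GC
logs remain the better source for actual STW numbers.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: summarizes GC activity per collector.
public class GcSummary {

    // Returns one line per collector: name, number of collections so far,
    // and approximate accumulated collection time in milliseconds.
    public static List<String> summarize() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + " count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        summarize().forEach(System.out::println);
    }
}
```

If the old-gen collector's count keeps climbing under steady load, that
is a hint to go look at the GC log for the pauses themselves.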

In order to avoid that, you need to do a few things:
- make sure you don't have any short-lived super-large objects: when
large objects are promoted from the young generation, they need to
find contiguous space in the old gen. If you allocate, say, a 400MB
array, even if it's short lived, it's unlikely you'll find 400MB of
contiguous space in the old gen without defragmenting. This will cause
an STW pause.

If you have some super-large objects allocated at startup, that's OK,
they'll just park themselves in the old gen and not cause trouble.

- make sure that most of your objects are "around the same size". This
prevents fragmentation build-up in the old gen.

- move big memory consumers off-heap if possible
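
To make the three points above concrete, here is a minimal sketch of one
possible approach, not what HBase actually does: instead of a single
huge array, keep a large logical buffer as many equal-sized chunks, and
optionally place those chunks off-heap with ByteBuffer.allocateDirect.
The class name ChunkedBuffer and its parameters are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: a large logical byte buffer stored as equal-sized chunks,
// so the JVM never has to find one huge contiguous region for it.
public class ChunkedBuffer {
    private final ByteBuffer[] chunks;
    private final int chunkSize;
    private final long capacity;

    public ChunkedBuffer(long capacity, int chunkSize, boolean offHeap) {
        if (capacity < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("capacity must be >= 0 and chunkSize > 0");
        }
        this.capacity = capacity;
        this.chunkSize = chunkSize;
        // Round up so the last partial chunk is still covered.
        int count = Math.toIntExact((capacity + chunkSize - 1) / chunkSize);
        this.chunks = new ByteBuffer[count];
        for (int i = 0; i < count; i++) {
            // Every chunk has the same size; direct buffers keep the bytes off-heap.
            chunks[i] = offHeap
                    ? ByteBuffer.allocateDirect(chunkSize)
                    : ByteBuffer.allocate(chunkSize);
        }
    }

    public void put(long index, byte value) {
        checkIndex(index);
        chunks[(int) (index / chunkSize)].put((int) (index % chunkSize), value);
    }

    public byte get(long index) {
        checkIndex(index);
        return chunks[(int) (index / chunkSize)].get((int) (index % chunkSize));
    }

    public int chunkCount() {
        return chunks.length;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= capacity) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }
}
```

For example, new ChunkedBuffer(400L * 1024 * 1024, 2 * 1024 * 1024, true)
would hold 400MB as 200 direct buffers of 2MB each; the 2MB chunk size
is an arbitrary choice for illustration.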

We've done a pretty good job of the above so far, and with a bit more
careful analysis I think it's possible to fully avoid old-gen STW
pauses.


On Wed, Jun 12, 2013 at 8:35 PM, Nicolas Liochon <nkeywal@gmail.com> wrote:
> Hi there,
> During the hackathon I had some discussions around GC on large heaps.
> This guy, who seems to know what he is talking about, and had a patch
> accepted in hotspot jdk, said in 2011 that he's got a configuration working
> reasonably well with large heaps at that time :
> "I was able to keep GC pause on 32Gb Oracle Coherence storage node below
> 150ms on 8 core server."
> (in http://java.dzone.com/articles/how-tame-java-gc-pauses)
> There is a lot of stuff in his blog, some of it in Russian only, but at
> least one of us will understand it.
> http://blog.ragozin.info/2011/07/openjdk-patch-cutting-down-gc-pause.html
> http://fr.slideshare.net/aragozin/garbage-collection-in-jvm-mailru-fin
> Cheers,
> Nicolas

Todd Lipcon
Software Engineer, Cloudera

