hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2902) Improve our default shipping GC config. and doc -- along the way do a bit of GC myth-busting
Date Tue, 31 Aug 2010 16:49:54 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904678#action_12904678 ]

stack commented on HBASE-2902:

The proposal above is to enable GC logging by default.  Unfortunately, we can't just yet.  While
GC logging is apparently near friction-free, the lack of rotation for live logs makes enabling
it untenable (here is the latest on rotating logs: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2010-May/000597.html).

> Improve our default shipping GC config. and doc -- along the way do a bit of GC myth-busting
> --------------------------------------------------------------------------------------------
>                 Key: HBASE-2902
>                 URL: https://issues.apache.org/jira/browse/HBASE-2902
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>         Attachments: Fragger.java
> This issue is about improving the near-term story, working with what we have today: the
slowly evolving (?) 1.6.x JVMs and CMS.  (Longer term, another HBase issue tracks the G1 story,
and Todd is gaining a bit of traction over on the HotSpot GC list.)
> At the moment we ship with CMS and i-CMS enabled by default.  At a minimum, i-CMS does
not apply on most hardware HBase is deployed on (i-CMS is for hardware with two or fewer
processors), and it seems as though we do not use multiple threads for YG collections; i.e.
-XX:+UseParNewGC, "Use parallel threads in the new generation".  (Here's what I see: it seems
to be off in JDK 6 according to http://www.md.pp.ru/~eu/jdk6options.html#UseParNewGC  but then
this says it's on by default when using CMS -> http://blogs.sun.com/jonthecollector/category/Java ...
but then this says to enable it: http://www.austinjug.org/presentations/JDK6PerfUpdate_Dec2009.pdf.
 I see this when it's enabled: [Rescan (parallel) ... so it seems like it's off.  Need to review
the source code.)
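One way to settle the ParNew question without reading the source is to ask the running JVM which collectors it registered. The following is a minimal sketch using the standard java.lang.management API; the class name ActiveCollectors is made up for illustration. On HotSpot JVMs of this era the young-generation bean is, as far as I know, named "ParNew" when parallel young collection is on and "Copy" when it is not, but treat those names as an assumption to check against the JVM in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase.
final class ActiveCollectors {

    /** Names of the collectors the running JVM registered. */
    static List<String> names() {
        List<String> out = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            out.add(gc.getName());
        }
        return out;
    }

    public static void main(String[] args) {
        // Print each collector with its running totals so far.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + " collections=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Running it with the same GC flags as the region server should show whether the young collector in effect is the parallel one.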
> We should make the above changes or at least doc them.
> We should consider enabling GC logging by default.  It's low cost, apparently (citation
below).  We'd just need to do something about the log management.  Not sure you can roll them
-- investigate -- and anyway we should roll on startup at least so we don't lose GC logs
across restarts.
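Rolling on startup can be sketched as renaming the previous GC log aside before the JVM that writes to it (via -Xloggc:<file>) is launched. The class name GcLogRoller, the timestamp suffix, and the default path below are all assumptions for illustration; in practice this step might just as well live in the start script.

```java
import java.io.File;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper, not part of HBase.
final class GcLogRoller {

    /**
     * If gcLog exists, rename it to "<name>.<timestamp>" in the same directory
     * and return the renamed file; return null when there was nothing to roll.
     */
    static File rollOnStartup(File gcLog) {
        if (!gcLog.exists()) {
            return null;
        }
        String stamp = new SimpleDateFormat("yyyyMMdd-HHmmss").format(new Date());
        File rolled = new File(gcLog.getParentFile(), gcLog.getName() + "." + stamp);
        if (!gcLog.renameTo(rolled)) {
            throw new IllegalStateException("could not rename " + gcLog + " to " + rolled);
        }
        return rolled;
    }

    public static void main(String[] args) {
        // Assumed default path; pass the real -Xloggc target as the first argument.
        File gcLog = new File(args.length > 0 ? args[0] : "logs/gc-hbase.log");
        File rolled = rollOnStartup(gcLog);
        System.out.println(rolled == null ? "nothing to roll" : "rolled to " + rolled);
    }
}
```

This only preserves logs across restarts; it does nothing about rotating a log that a live JVM is still writing.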
> We should play with the initiating ratios (-XX:CMSInitiatingOccupancyFraction); maybe starting
CMS earlier will put off the heap fragmentation that brings on the killer stop-the-world collection.
> I read somewhere recently that invoking System.gc will run a CMS GC if CMS is enabled.
 We should investigate.  If it ran the serial collector, we could at least document that users
could run a defragmenting stop-the-world serial collection during 'off' times, or at least make
it so the stop-the-world happened when expected instead of at some random time.
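A quick way to investigate is to count collections per collector around a single System.gc() call. The sketch below uses only the standard java.lang.management API; the class name ExplicitGcProbe is made up. As far as I know, HotSpot treats System.gc() as a stop-the-world full collection by default, -XX:+ExplicitGCInvokesConcurrent asks it to start a concurrent CMS cycle instead, and -XX:+DisableExplicitGC turns the call into a no-op, but all three behaviors are worth verifying on the JVM in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of HBase.
final class ExplicitGcProbe {

    /** Current collection count per collector name. */
    static Map<String, Long> counts() {
        Map<String, Long> out = new LinkedHashMap<String, Long>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            out.put(gc.getName(), gc.getCollectionCount());
        }
        return out;
    }

    /** How many collections each collector ran across one System.gc() call. */
    static Map<String, Long> probe() {
        Map<String, Long> before = counts();
        System.gc();
        Map<String, Long> after = counts();
        Map<String, Long> delta = new LinkedHashMap<String, Long>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            Long prior = before.get(e.getKey());
            delta.put(e.getKey(), e.getValue() - (prior == null ? 0L : prior));
        }
        return delta;
    }

    public static void main(String[] args) {
        System.out.println("collections caused by System.gc(): " + probe());
    }
}
```

Comparing the output with and without the flags above, alongside the GC log, should show which collector actually services the explicit request.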
> While here, let's do a bit of myth-busting.  Here are a few postulates:
> + Keep the young generation small, or at least cap its size, else it grows to occupy a
large part of the heap
> The above is a Ryanism.  Doing the above -- along w/ a massive heap size -- has put off,
at SU at least, the fragmentation that others run into.
> Interestingly, this document -- http://www.google.com/url?sa=t&source=web&cd=1&ved=0CBcQFjAA&url=http%3A%2F%2Fmediacast.sun.com%2Fusers%2FLudovic%2Fmedia%2FGCTuningPresentationFISL10.pdf&ei=ZPtaTOiLL5bcsAa7gsl1&usg=AFQjCNHP691SIIE-6NSKccM4mZtm1U6Ahw&sig2=2cjvcaeyn1aISL2THEENjQ
-- would seem to recommend nearly the opposite, in that it suggests that when using CMS, you do all
you can to keep stuff in the YG.  Avoid having stuff age up to the tenured heap if you can.
 This would seem to imply using a larger YG.
> Chatting w/ Ryan, the reason to keep the YG small is so we don't have long pauses doing
YG collections.  According to the above citation, it's not big YGs that cause long YG pauses
but the copying of data (not sure if it's copying of data inside the YG or if it meant copying
up to tenured -- chatting w/ Ryan we thought there'd be no difference -- but we should investigate).
> I took a look at a running upload, with a small heap admittedly.  What I was seeing was
that, using our defaults, rare was anything in YG of age > 1 GC; i.e. nearly everything in
YG was being promoted.  This may have been a symptom of my small (default) heap, but we should
look into this and try to ensure objects are promoted because they are old, not because there
is not enough space in YG.
> + We should write a slab allocator or allocate memory outside of the JVM heap
> Thinking on this: a slab allocator, while a lot of work, I can see helping us w/ the block
cache, but what if the memstore is the fragmented-heap maker?  In that case, a slab allocator is
only part of the fix.  It should be easy to see which is the fragmented-heap maker since we
can turn off the cache easily enough (though it seems like it's accessed anyway even if disabled
-- need to make sure it's not doing allocations to the cache in this case).
> Other things while on this topic.  We need to come up w/ a loading that brings on the
CMS fault that comes of a fragmented heap (CMS is non-compacting, but apparently it will join
together free blocks to make bigger ones, so there is some anti-fragmenting behavior going
on).  Apparently lots of large, irregularly sized items are the ticket.
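A load of "large, irregularly sized items" could be sketched roughly as follows. This is an illustrative guess at the shape of such a load, not the attached Fragger.java; the class name IrregularAllocLoad, the slot count, and the size range are all assumptions, and whether it actually provokes the CMS failure will depend on heap size and GC flags.

```java
import java.util.Random;

// Hypothetical load generator, not the attached Fragger.java.
final class IrregularAllocLoad {

    /**
     * Allocates `rounds` byte arrays of random size in [minBytes, maxBytes],
     * storing each in a random slot so that older arrays die at irregular
     * times and leave differently sized holes behind.
     * Returns the number of bytes still reachable at the end.
     */
    static long run(int rounds, int liveSlots, int minBytes, int maxBytes, long seed) {
        Random rnd = new Random(seed);
        byte[][] live = new byte[liveSlots][];
        for (int i = 0; i < rounds; i++) {
            int size = minBytes + rnd.nextInt(maxBytes - minBytes + 1);
            live[rnd.nextInt(liveSlots)] = new byte[size];
        }
        long retained = 0;
        for (byte[] b : live) {
            if (b != null) {
                retained += b.length;
            }
        }
        return retained;
    }

    public static void main(String[] args) {
        // Assumed sizing: roughly 130 MB live with 256 slots of 16 KB to 1 MB.
        // Adjust to the heap under test.
        long retained = run(200000, 256, 16 * 1024, 1024 * 1024, 42L);
        System.out.println("retained bytes: " + retained);
    }
}
```

Run it with the CMS flags and GC logging on, and watch the log for a stop-the-world full collection.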

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
