Subject: Re: Help Tuning HBase GC Settings
From: Derek Wollenstein
To: user@hbase.apache.org
Date: Tue, 6 Dec 2011 18:30:56 -0800

Jacques --

The main problem we really see is that after "a while" (yes, this is
unspecific) we start seeing "random" (i.e. uncorrelated with the key
being retrieved, but correlated with time) timeouts (>500ms) retrieving
small (<200 byte) rows from HBase. The interesting thing is that even a
test case that seems like it should be very cacheable (retrieving
literally the same row over and over again) shows the same
responsiveness. I was just suspecting that if we're running 6
GCs/second (which is what this hits at "peak GC") at 25-35ms each, that
could certainly explain a decent amount of unavailability.

1) We don't yet have MSLAB enabled, primarily because we're trying to
turn on settings changes one at a time. I also wasn't clear whether
these symptoms were consistent with the ones MSLAB was designed to fix,
although I have considered it.
2) I don't think a four-minute pause is good, but I don't think a
120-second pause is that great either, so it's something we'd have to
worry about.
3) Yes, we have 8 physical cores (16 HT cores), so that shouldn't be
the problem.
4) Off-heap block cache does seem like a cool feature; I'll keep it in
mind.

The main reason I was asking is that we really did see this sudden
"jump" in GC activity over time, and I was hoping it indicated
something like setting the block cache too high relative to the heap
size...
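
(A rough way to put a number on that overhead: the log written by the
-Xloggc flag quoted below can be summed for ParNew stop-the-world time.
The pipeline below is only a sketch against the log format shown further
down in the thread; it counts just the ParNew pauses, so the CMS
initial-mark/remark pauses and the concurrent phases are not included.)

# sum young-GC pause time from the GC log named in the -Xloggc flag below
grep ParNew $HBASE_HOME/logs/gc-hbase.log \
  | grep -o 'real=[0-9.]*' \
  | awk -F= '{ sum += $2; n++ } END { printf "%d young GCs, %.1f s of real pause\n", n, sum }'

At roughly 6 collections/second and 25-35ms each, that works out to
about 150-210ms of pause per second, i.e. on the order of 15-20% of
wall-clock time spent stopped.
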
On Tue, Dec 6, 2011 at 6:14 PM, Jacques wrote:

> I'll start with clearly stating that I'm not a gc specialist. I spend a
> bunch of time with it but forget all the things I learn once I solve my
> problems...
>
> What exactly is the problem here? Does the server become unresponsive
> after 16 hours? What happens in the HBase logs for that regionserver?
> I believe you're seeing frequent runs likely because of fragmentation
> of your heap along with your XX:CMSInitiatingOccupancyFraction of 60%.
> These would be a precursor to a full gc, which would likely actually
> take the server down.
>
> A few quick thoughts that you may or may not have run across:
>
>   - MSLAB is your friend if you haven't been using it already. See more
>     here:
>     http://www.cloudera.com/blog/2011/03/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-3/
>   - I can't remember exactly, but I feel like the number that used to be
>     quoted by some was 10 seconds per gb for a full gc. So you're looking
>     at a full gc of over ~4 minutes with that size heap once you do
>     arrive at a full gc.
>   - If you're okay having unresponsive regions for 4+ minutes, you'd
>     also want to increase your ZooKeeper timeout to allow for it.
>   - If I remember correctly, at a recent presentation yfrog was
>     utilizing heaps as high as 64gb, but most people thought that was
>     very risky and that you should run much lower. The 16gb that Doug
>     quotes is more what people seemed to use.
>   - I haven't heard about most people setting GC threads specifically.
>     Since you set the gc threads at 6, I assume you have at least 6 true
>     cores?
>
> We used to run our regionservers up around 24gb but had constant
> problems. Ultimately, we settled down at 12gb heaps with mslab enabled
> (and at 4mb as opposed to the traditional 2mb default, due to our cell
> sizes).
>
> Also, off-heap block cache is coming up in the 0.92 release
> (https://issues.apache.org/jira/browse/HBASE-4027). That should
> theoretically allow you to use a bunch more memory for the block cache
> without the same problems. Others who are more familiar with the
> feature would be able to speak better to real-world results...
>
> Jacques
>
> On Tue, Dec 6, 2011 at 5:30 PM, Derek Wollenstein wrote:
>
> > I will take a look at lowering this; unfortunately I'm inheriting
> > existing settings and trying to be as conservative as possible when
> > making changes. I can definitely try lowering the memory -- I've
> > gotten mixed messages on how much to allocate to the HBase heap. I'll
> > start taking a look at moving both of these settings down and see how
> > it affects performance (and trying to use
> > https://github.com/brianfrankcooper/YCSB/wiki for testing). Thanks
> > for the suggestion.
> >
> > On Tue, Dec 6, 2011 at 5:20 PM, Doug Meil wrote:
> >
> > > There are others who know these GC issues better than I do, but
> > > setting hfile.block.cache.size to .5 seems a bit aggressive to me.
> > > That's 50% of the heap right there.
> > >
> > > Also, the issue with setting the max-heap to 24Gb is that whenever
> > > a full GC is required on a heap that size, it's a killer. Folks at
> > > recent HBase hackathons were talking about not going higher than
> > > 16Gb for RS.
> > >
> > > On 12/6/11 8:10 PM, "Derek Wollenstein" wrote:
> > >
> > > > I've been working on improving GC settings for HBase RegionServer
> > > > instances, and I seem to have run into a bit of a dead end.
> > > >
> > > > Basically I've been trying to tune GC settings and memory
> > > > settings appropriately, but I keep on having my server reach
> > > > something like GC death.
> > > >
> > > > My current memory settings are:
> > > >
> > > > HEAPSIZE=24576
> > > > -ea -Xmx24576 -Xms24576 -XX:+UseConcMarkSweepGC -XX:+UseCompressedOops
> > > > -XX:NewSize=192m -XX:MaxNewSize=192m
> > > > -XX:CMSInitiatingOccupancyFraction=60 -verbose:gc -XX:+PrintGCDetails
> > > > -XX:+PrintGCDateStamps -Xloggc:$HBASE_HOME/logs/gc-hbase.log
> > > > -XX:ParallelGCThreads=6
> > > >
> > > > We've also set hfile.block.cache.size to 0.5 (believing that
> > > > increasing the space for cache should improve performance; I'm
> > > > willing to accept that this could be causing problems, I just
> > > > haven't seen it reported).
> > > >
> > > > Basically, as the service is "warming up", we see reasonable GC
> > > > log timings:
> > > >
> > > > ...
> > > > 2011-12-06T10:19:14.834+0000: 782.473: [GC [1 CMS-initial-mark: 2225311K(3705244K)] 2237042K(3882204K), 0.0028630 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
> > > > 2011-12-06T10:19:14.837+0000: 782.476: [CMS-concurrent-mark-start]
> > > > 2011-12-06T10:19:15.523+0000: 783.162: [CMS-concurrent-mark: 0.686/0.686 secs] [Times: user=1.98 sys=0.05, real=0.69 secs]
> > > > 2011-12-06T10:19:15.523+0000: 783.162: [CMS-concurrent-preclean-start]
> > > > 2011-12-06T10:19:15.533+0000: 783.172: [CMS-concurrent-preclean: 0.010/0.010 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
> > > > 2011-12-06T10:19:15.534+0000: 783.172: [CMS-concurrent-abortable-preclean-start]
> > > > 2011-12-06T10:19:15.742+0000: 783.381: [GC 783.381: [ParNew: 168978K->9067K(176960K), 0.0076540 secs] 2394290K->2235970K(3882204K), 0.0077920 secs] [Times: user=0.05 sys=0.00, real=0.00 secs]
> > > > 2011-12-06T10:19:16.351+0000: 783.990: [CMS-concurrent-abortable-preclean: 0.123/0.817 secs] [Times: user=0.87 sys=0.06, real=0.81 secs]
> > > > 2011-12-06T10:19:16.352+0000: 783.990: [GC[YG occupancy: 97548 K (176960 K)]783.990: [Rescan (parallel) , 0.0153180 secs]784.006: [weak refs processing, 0.0000170 secs] [1 CMS-remark: 2226903K(3705244K)] 2324451K(3882204K), 0.0154470 secs] [Times: user=0.09 sys=0.00, real=0.02 secs]
> > > > ...
> > > >
> > > > This seems to continue for about 16 hours, but as we hit some
> > > > critical time, we end up stuck running multiple young GCs per
> > > > second.
> > > > Note that they all appear to be successful (if I'm reading the
> > > > logs correctly):
> > > >
> > > > 2011-12-06T23:49:42.132+0000: 49409.770: [GC 49409.771: [ParNew: 166477K->7992K(176960K), 0.0390800 secs] 11642960K->11486691K(19297180K), 0.0392470 secs] [Times: user=0.22 sys=0.00, real=0.04 secs]
> > > > 2011-12-06T23:49:42.522+0000: 49410.161: [CMS-concurrent-abortable-preclean: 0.247/0.589 secs] [Times: user=1.04 sys=0.02, real=0.58 secs]
> > > > 2011-12-06T23:49:42.523+0000: 49410.162: [GC[YG occupancy: 149306 K (176960 K)]49410.162: [Rescan (parallel) , 0.0314250 secs]49410.193: [weak refs processing, 0.0000890 secs] [1 CMS-remark: 11478698K(19120220K)] 11628005K(19297180K), 0.0316410 secs] [Times: user=0.17 sys=0.01, real=0.03 secs]
> > > > 2011-12-06T23:49:42.555+0000: 49410.194: [CMS-concurrent-sweep-start]
> > > > 2011-12-06T23:49:42.597+0000: 49410.236: [GC 49410.236: [ParNew: 165304K->7868K(176960K), 0.0405890 secs] 11643677K->11487303K(19297180K), 0.0407690 secs] [Times: user=0.23 sys=0.00, real=0.04 secs]
> > > > 2011-12-06T23:49:43.048+0000: 49410.687: [GC 49410.687: [ParNew: 165180K->6485K(176960K), 0.0389860 secs] 11027946K->10869726K(19297180K), 0.0392000 secs] [Times: user=0.23 sys=0.00, real=0.04 secs]
> > > > 2011-12-06T23:49:43.181+0000: 49410.819: [CMS-concurrent-sweep: 0.542/0.625 secs] [Times: user=1.73 sys=0.02, real=0.62 secs]
> > > > 2011-12-06T23:49:43.181+0000: 49410.819: [CMS-concurrent-reset-start]
> > > > 2011-12-06T23:49:43.232+0000: 49410.870: [CMS-concurrent-reset: 0.051/0.051 secs] [Times: user=0.10 sys=0.00, real=0.05 secs]
> > > > 2011-12-06T23:49:43.481+0000: 49411.120: [GC 49411.120: [ParNew: 163797K->7150K(176960K), 0.0409170 secs] 10380339K->10224698K(19297180K), 0.0410870 secs] [Times: user=0.24 sys=0.00, real=0.04 secs]
> > > > 2011-12-06T23:49:43.920+0000: 49411.559: [GC 49411.559: [ParNew: 164462K->8178K(176960K), 0.0394640 secs] 10382010K->10226321K(19297180K), 0.0396290 secs] [Times: user=0.22 sys=0.00, real=0.04 secs]
> > > > 2011-12-06T23:49:44.355+0000: 49411.994: [GC 49411.994: [ParNew: 165490K->8303K(176960K), 0.0367330 secs] 10383633K->10227244K(19297180K), 0.0368970 secs] [Times: user=0.22 sys=0.00, real=0.03 secs]
> > > > 2011-12-06T23:49:44.785+0000: 49412.424: [GC 49412.424: [ParNew: 165615K->10439K(176960K), 0.0398080 secs] 10384556K->10229598K(19297180K), 0.0399870 secs] [Times: user=0.23 sys=0.00, real=0.04 secs]
> > > > 2011-12-06T23:49:45.225+0000: 49412.864: [GC 49412.864: [ParNew: 167751K->13171K(176960K), 0.0393970 secs] 10386910K->10233071K(19297180K), 0.0395730 secs] [Times: user=0.23 sys=0.00, real=0.04 secs]
> > > > ...
> > > >
> > > > Note that we are doing about 2 YGCs/second (about 80ms). I believe
> > > > this situation represents us using up an undesirable amount of
> > > > memory resources, but I can't find any particular reason why we're
> > > > going from using a reasonable amount of GC to running GC
> > > > constantly. One theory is that this is the result of something in
> > > > the young generation being promoted? But I'm not really sure what
> > > > would be causing this pathological behavior. Any ideas or
> > > > alternative places to look would be greatly appreciated.
> > > >
> > > > --Derek
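
Pulling the suggestions in this thread together, a minimal hbase-env.sh
sketch might look like the following. The values are illustrative only,
not tested recommendations: the 12g heap follows Jacques' numbers rather
than the original 24g, and the remaining flags are the ones from Derek's
original post. MSLAB, the block cache fraction, and the ZooKeeper session
timeout are hbase-site.xml properties, so they appear only as comments.

# hbase-env.sh (sketch; values taken from the discussion above)
export HBASE_HEAPSIZE=12288   # 12g per Jacques' setup; HBASE_HEAPSIZE is in mb and supplies -Xmx
export HBASE_OPTS="-ea -Xms12288m \
  -XX:+UseConcMarkSweepGC -XX:+UseCompressedOops \
  -XX:NewSize=192m -XX:MaxNewSize=192m \
  -XX:CMSInitiatingOccupancyFraction=60 \
  -XX:ParallelGCThreads=6 \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
  -Xloggc:$HBASE_HOME/logs/gc-hbase.log"

# The rest of the thread's suggestions live in hbase-site.xml, not here:
#   hbase.hregion.memstore.mslab.enabled   = true      (MSLAB on)
#   hbase.hregion.memstore.mslab.chunksize = 4194304   (4mb chunks, per Jacques' cell sizes)
#   hfile.block.cache.size                 = something below the 0.5 Doug calls aggressive
#   zookeeper.session.timeout              = raised only if multi-minute pauses are acceptable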