hbase-user mailing list archives

From Saad Mufti <saad.mu...@gmail.com>
Subject Re: Slow sync cost
Date Wed, 27 Apr 2016 19:39:01 GMT
Thanks, that is a lot of useful information. I have a lot of things to look
at now in my cluster and API clients.

Cheers.

----
Saad


On Wed, Apr 27, 2016 at 3:28 PM, Bryan Beaudreault <bbeaudreault@hubspot.com> wrote:

> We turned off auto-splitting by setting our region sizes to very large
> (100gb). We split them manually when they become too unwieldy from a
> compaction POV.
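>
> (As a rough sketch of that setup -- illustrative only, not our exact config --
> raising the max region size in hbase-site.xml is what effectively disables
> auto-splitting; we then split by hand:)
>
>   <property>
>     <name>hbase.hregion.max.filesize</name>
>     <!-- ~100 GB; regions essentially never reach this, so they never auto-split -->
>     <value>107374182400</value>
>   </property>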
>
> We do use BufferedMutators in a number of places. They are pretty
> straightforward, and definitely improve performance. The only lessons
> learned there would be to use low buffer sizes. You'll get a lot of
> benefits from just 1MB size, but if you want to go higher than that, you
> should aim for less than half of your G1GC region size. Anything
> larger than that is considered a humongous object, and has implications for
> garbage collection. The blog post I linked earlier goes into humongous
> objects:
>
> http://product.hubspot.com/blog/g1gc-fundamentals-lessons-from-taming-garbage-collection#HumongousObjects
> We've seen them to be very bad for GC performance when many of them come in
> at once.
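>
> (If it's useful, here is a minimal sketch of the BufferedMutator pattern --
> illustrative only, not our actual code; the table name, column family, and
> values are made up:)
>
>   import org.apache.hadoop.conf.Configuration;
>   import org.apache.hadoop.hbase.HBaseConfiguration;
>   import org.apache.hadoop.hbase.TableName;
>   import org.apache.hadoop.hbase.client.*;
>   import org.apache.hadoop.hbase.util.Bytes;
>
>   Configuration conf = HBaseConfiguration.create();
>   try (Connection conn = ConnectionFactory.createConnection(conf)) {
>     // Keep the write buffer small (~1 MB) so each flushed batch stays well
>     // under half the G1GC region size and never becomes a humongous object.
>     BufferedMutatorParams params =
>         new BufferedMutatorParams(TableName.valueOf("example_table"))
>             .writeBufferSize(1L * 1024 * 1024);
>     try (BufferedMutator mutator = conn.getBufferedMutator(params)) {
>       Put put = new Put(Bytes.toBytes("row-1"));
>       put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
>       mutator.mutate(put); // buffered client-side, flushed when the buffer fills
>     } // close() flushes any remaining buffered mutations
>   }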
>
> So for us, most of our regionservers have 40gb+ heaps, for which we use
> 32mb G1GC regions. With 32mb G1GC regions, we aim for all buffered mutators
> to use less than 16mb buffer sizes -- we even go further to limit it to
> around 10mb just to be safe. We also do the same for reads -- we try to
> limit all scanner and multiget responses to less than 10mb.
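>
> (Concretely, that combination looks roughly like the following; the numbers
> here are illustrative rather than our exact production values:)
>
>   # regionserver JVM: 40gb heap with 32mb G1GC regions
>   -Xms40g -Xmx40g -XX:+UseG1GC -XX:G1HeapRegionSize=32m
>
>   // client side: cap scanner responses well under the humongous threshold
>   Scan scan = new Scan();
>   scan.setMaxResultSize(10L * 1024 * 1024); // ~10mb per scanner RPC response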
>
> We've created a dashboard with our internal monitoring system which shows
> the count of requests that we consider too large, for all applications (we
> have many 100s of deployed applications hitting these clusters). It's on
> the individual teams that own the applications to try to drive that count
> down to 0. We've built a detention queue into HBase (similar to quotas),
> where we can put any of these applications, based on their username, if they
> are doing something that is adversely affecting the rest of the system --
> for instance, if they start spamming a lot of too-large requests, badly
> filtered scans, etc. In the detention queue, they use their own RPC
> handlers, which we can aggressively limit or reject if need be to preserve
> the cluster.
>
> Hope this helps
>
> On Wed, Apr 27, 2016 at 2:54 PM Saad Mufti <saad.mufti@gmail.com> wrote:
>
> > Hi Bryan,
> >
> > At HubSpot, do you use a single shared (per-JVM) BufferedMutator anywhere
> > in an attempt to get better performance? Any lessons learned from any
> > attempts? Has it hurt or helped?
> >
> > Also, do you have any experience with write performance in conjunction with
> > auto-splitting activity kicking in, either with BufferedMutator or
> > separately with just direct Puts?
> >
> > Thanks.
> >
> > ----
> > Saad
> >
> >
> >
> >
> > On Wed, Apr 27, 2016 at 2:22 PM, Bryan Beaudreault <bbeaudreault@hubspot.com> wrote:
> >
> > > Hey Ted,
> > >
> > > Actually, gc_log_visualizer is open-sourced; I will ask the author to
> > > update the post with links:
> > > https://github.com/HubSpot/gc_log_visualizer
> > >
> > > The author was taking a foundational approach with this blog post. We do
> > > use ParallelGC for backend non-API deployables, such as Kafka consumers
> > > and long-running daemons. However, we treat HBase like our APIs, in that
> > > it must have low-latency requests. So we use G1GC for HBase.
> > >
> > > Expect another blog post from another HubSpot engineer soon, with all the
> > > details on how we approached G1GC tuning for HBase. I will update this
> > > list when it's published, and will put some pressure on that author to get
> > > it out there :)
> > >
> > > On Wed, Apr 27, 2016 at 2:01 PM Ted Yu <yuzhihong@gmail.com> wrote:
> > >
> > > > Bryan:
> > > > w.r.t. gc_log_visualizer, is there a plan to open-source it?
> > > >
> > > > bq. while backend throughput will be better/cheaper with ParallelGC.
> > > >
> > > > Does the above mean that HBase servers are still using ParallelGC?
> > > >
> > > > Thanks
> > > >
> > > > On Wed, Apr 27, 2016 at 7:39 AM, Bryan Beaudreault <bbeaudreault@hubspot.com> wrote:
> > > >
> > > > > We have 6 production clusters and all of them are tuned differently,
> > > > > so I'm not sure there is a setting I could easily give you. It really
> > > > > depends on the usage. One of our devs wrote a blog post on G1GC
> > > > > fundamentals recently. It's rather long, but could be worth a read:
> > > > >
> > > > > http://product.hubspot.com/blog/g1gc-fundamentals-lessons-from-taming-garbage-collection
> > > > >
> > > > > We will also have a blog post coming out in the next week or so that
> > > > > talks specifically to tuning G1GC for HBase. I can update this thread
> > > > > when that's available.
> > > > >
> > > > > On Tue, Apr 26, 2016 at 8:08 PM Saad Mufti <saad.mufti@gmail.com> wrote:
> > > > >
> > > > > > That is interesting. Would it be possible for you to share what GC
> > > > > > settings you ended up on that gave you the most predictable
> > > > > > performance?
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > ----
> > > > > > Saad
> > > > > >
> > > > > >
> > > > > > On Tue, Apr 26, 2016 at 11:56 AM, Bryan Beaudreault <bbeaudreault@hubspot.com> wrote:
> > > > > >
> > > > > > > We were seeing this for a while with our CDH5 HBase clusters too.
> > > > > > > We eventually correlated it very closely to GC pauses. Through
> > > > > > > heavily tuning our GC we were able to drastically reduce the logs,
> > > > > > > by keeping most GCs under 100ms.
> > > > > > >
> > > > > > > On Tue, Apr 26, 2016 at 6:25 AM Saad Mufti <saad.mufti@gmail.com> wrote:
> > > > > > >
> > > > > > > > From what I can see in the source code, the default is actually
> > > > > > > > even lower at 100 ms (can be overridden with
> > > > > > > > hbase.regionserver.hlog.slowsync.ms).
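> > > > > > > >
> > > > > > > > (For reference, overriding it looks like an ordinary
> > > > > > > > hbase-site.xml entry on the region servers -- something like the
> > > > > > > > sketch below; the 500 ms value is just an example, not a
> > > > > > > > recommendation:)
> > > > > > > >
> > > > > > > >   <property>
> > > > > > > >     <name>hbase.regionserver.hlog.slowsync.ms</name>
> > > > > > > >     <!-- a WAL sync is logged as slow only if it exceeds this many ms -->
> > > > > > > >     <value>500</value>
> > > > > > > >   </property>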
> > > > > > > >
> > > > > > > > ----
> > > > > > > > Saad
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, Apr 26, 2016 at 3:13 AM, Kevin Bowling <kevin.bowling@kev009.com> wrote:
> > > > > > > >
> > > > > > > > > I see similar log spam while the system has reasonable
> > > > > > > > > performance. Was the 250ms default chosen with SSDs and 10GbE
> > > > > > > > > in mind or something? I guess I'm surprised a sync write that
> > > > > > > > > passes several times through JVMs to 2 remote datanodes would
> > > > > > > > > be expected to consistently happen that fast.
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > >
> > > > > > > > > On Mon, Apr 25, 2016 at 12:18 PM, Saad Mufti <saad.mufti@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > In our large HBase cluster based on CDH 5.5 in AWS, we're
> > > > > > > > > > constantly seeing the following messages in the region server
> > > > > > > > > > logs:
> > > > > > > > > >
> > > > > > > > > > 2016-04-25 14:02:55,178 INFO
> > > > > > > > > > org.apache.hadoop.hbase.regionserver.wal.FSHLog: Slow sync cost: 258 ms,
> > > > > > > > > > current pipeline:
> > > > > > > > > > [DatanodeInfoWithStorage[10.99.182.165:50010,DS-281d4c4f-23bd-4541-bedb-946e57a0f0fd,DISK],
> > > > > > > > > > DatanodeInfoWithStorage[10.99.182.236:50010,DS-f8e7e8c9-6fa0-446d-a6e5-122ab35b6f7c,DISK],
> > > > > > > > > > DatanodeInfoWithStorage[10.99.182.195:50010,DS-3beae344-5a4a-4759-ad79-a61beabcc09d,DISK]]
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > These happen regularly while HBase appears to be operating
> > > > > > > > > > normally with decent read and write performance. We do have
> > > > > > > > > > occasional performance problems when regions are
> > > > > > > > > > auto-splitting, and at first I thought this was related, but
> > > > > > > > > > now I see it happens all the time.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Can someone explain what this really means and whether we
> > > > > > > > > > should be concerned? I tracked down the source code that
> > > > > > > > > > outputs it in
> > > > > > > > > >
> > > > > > > > > > hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
> > > > > > > > > >
> > > > > > > > > > but after going through the code I think I'd need to know
> > > > > > > > > > much more about the code to glean anything from it or the
> > > > > > > > > > associated JIRA ticket
> > > > > > > > > > https://issues.apache.org/jira/browse/HBASE-11240.
> > > > > > > > > >
> > > > > > > > > > Also, what is this "pipeline" that the ticket and code talk
> > > > > > > > > > about?
> > > > > > > > > >
> > > > > > > > > > Thanks in advance for any information and/or clarification
> > > > > > > > > > anyone can provide.
> > > > > > > > > >
> > > > > > > > > > ----
> > > > > > > > > >
> > > > > > > > > > Saad
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
