hbase-user mailing list archives

From Nasron Cheong <nas...@gmail.com>
Subject Re: Stochastic Balancer by tables
Date Fri, 19 Jun 2015 19:31:01 GMT
Hi Mikhail,

> Something like N * number of region servers, or something different?


It's pretty much this, but we are additionally trying to ensure that, in the
best case, the whole column fits in memory on each region server.

Another restriction: since # MR map tasks == # of regions, we need to keep
the number of regions reasonable, as other MR tasks may need the slots.
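
As a rough sketch of that estimate (this is not our actual tool; the function
names, the per-server factor and the slot cap are all made up for
illustration, and nothing here calls an HBase API):

```python
import math


def estimate_region_count(num_region_servers, regions_per_server, max_map_slots):
    """Hypothetical heuristic: N regions per region server, capped so a full
    scan (one map task per region) does not take every MR slot."""
    target = num_region_servers * regions_per_server
    return max(1, min(target, max_map_slots))


def column_fits_in_memory(column_bytes, num_region_servers, cache_bytes_per_server):
    """Best-case check: assuming an even spread across servers, does each
    server's share of the column fit in its (assumed) cache budget?"""
    per_server = math.ceil(column_bytes / num_region_servers)
    return per_server <= cache_bytes_per_server
```

The cap matters for us because a table with more regions than we are willing
to give it map slots would starve other jobs during a full scan.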

Our use case is really time-based data, so as newer data comes in, it might
be better to relegate older data to larger regions? Not sure if that's
something we should consider.

> I think this is something which we definitely need to have in shell/web
> (separate question of how to filter/page it if there are many
> thousands of regions). But that's probably a different discussion. I can
> open a JIRA for that.


While you're at it: we use long row keys to take advantage of fast
start/stop filtering with Scanners, but they make the current UI listing of
region info unreadable. It would also be useful to see some indication of
a region's locality, request hits on the region, bloom filter hits, etc.
I'm sure there are all kinds of things that people want. :)

To get more visibility, I was trying to get Hannibal working, but it has
some issues running on our cluster. I'm sure there are more ideas there as
well.

Thanks!

- Nasron


On Fri, Jun 19, 2015 at 3:16 PM, Mikhail Antonov <olorinbant@gmail.com>
wrote:

> Nasron,
>
> Yeah, looks like you pretty much implemented pieces of logic being
> discussed in HBASE-13103 :) So that's interesting, thanks for telling
> us. Wondering, how did you estimate the number of desired regions?
> Something like N * number of region servers, or something different?
>
> "I couldn't find a tool to show regions and their sizes, for a specific
> table, so ended up writing one."
>
> I think this is something which we definitely need to have in shell/web
> (separate question of how to filter/page it if there are many
> thousands of regions). But that's probably a different discussion. I can
> open a JIRA for that.
>
> On Fri, Jun 19, 2015 at 9:19 AM, Nick Dimiduk <ndimiduk@gmail.com> wrote:
> > On Fri, Jun 19, 2015 at 7:45 AM, Nasron Cheong <nasron@gmail.com> wrote:
> >
> >> I couldn't find a tool to show regions and their sizes, for a specific
> >> table, so ended up writing one.
> >>
> >
> > Nasron,
> >
> > Would you mind having a look at the patch/RB on HBASE-13103? Does the API
> > pair RegionNormalizer/Normalization plan look like a reasonable harness for
> > you to hang your custom tool onto? Just like the balancer, it's designed to
> > be extensible with different normalization strategies.
> >
> > On Fri, Jun 19, 2015 at 3:47 AM, Dejan Menges <dejan.menges@gmail.com>
> >> wrote:
> >>
> >> > Just have to say that hbase.master.loadbalance.bytable saved us after we
> >> > discovered it. In our case we had to set it manually to true, and then it
> >> > was easy to catch hot spotting on unusually large regions and handle it.
> >> >
> >> > Btw +1 for HBASE-13013, had to say it, something that makes me want to
> >> > start upgrading our HDP stack on Monday morning.
> >> >
> >> > On Thu, Jun 18, 2015 at 11:04 PM, Bryan Beaudreault <
> >> > bbeaudreault@hubspot.com> wrote:
> >> >
> >> > > Just had to say, https://issues.apache.org/jira/browse/HBASE-13103 looks
> >> > > *AWESOME*
> >> > >
> >> > > On Thu, Jun 18, 2015 at 5:00 PM Mikhail Antonov <olorinbant@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > Yeah, I could see 2 reasons for the remaining few regions to take a
> >> > > > disproportionately long time - 1) those regions are disproportionately
> >> > > > large (you should be able to quickly confirm it) and 2) they happened
> >> > > > to be hosted on really slow/overloaded machine(s). #1 seems far more
> >> > > > likely to me.
> >> > > >
> >> > > > And as Nick said, there's an ongoing effort to provide exactly what
> >> > > > you've described - centralized periodic analysis of region sizes and
> >> > > > equalization as needed (somewhat complementary to balancing), and any
> >> > > > feedback (especially from folks experiencing real issues with unequal
> >> > > > region sizes) is much appreciated.
> >> > > >
> >> > > > -Mikhail
> >> > > >
> >> > > > On Thu, Jun 18, 2015 at 10:07 AM, Nick Dimiduk <ndimiduk@gmail.com>
> >> > > > wrote:
> >> > > > > If you're interested in region size balancing, please have a look at
> >> > > > > https://issues.apache.org/jira/browse/HBASE-13103 . Please provide
> >> > > > > feedback as we're hoping to have an early version available in 1.2.
> >> > > > > Which reminds me, I owe Mikhail another review...
> >> > > > >
> >> > > > > On Thu, Jun 18, 2015 at 9:39 AM, Elliott Clark <eclark@apache.org>
> >> > > > > wrote:
> >> > > > >
> >> > > > >> The balancer is not responsible for region size decisions. The
> >> > > > >> balancer is only responsible for deciding which regionservers
> >> > > > >> should host which regions.
> >> > > > >> Splits are determined by data size of a region. See max store file
> >> > > > >> size.
> >> > > > >>
> >> > > > >> On Thu, Jun 18, 2015 at 7:50 AM, Nasron Cheong <nasron@gmail.com>
> >> > > > >> wrote:
> >> > > > >>
> >> > > > >> > Hi,
> >> > > > >> >
> >> > > > >> > I've noticed there are two settings available when using the
> >> > > > >> > HBase balancer (specifically the default stochastic balancer)
> >> > > > >> >
> >> > > > >> > hbase.master.balancer.stochastic.tableSkewCost
> >> > > > >> >
> >> > > > >> > hbase.master.loadbalance.bytable
> >> > > > >> >
> >> > > > >> > How do these two settings relate? The documentation indicates
> >> > > > >> > when using the stochastic balancer that 'bytable' should be
> >> > > > >> > set to false?
> >> > > > >> >
> >> > > > >> > Our deployment relies on very few, very large tables, and I've
> >> > > > >> > noticed bad distribution when accessing some of the tables.
> >> > > > >> > E.g. there are 443 regions for a single table, but when doing
> >> > > > >> > a MR job over a full scan of the table, the first 426 regions
> >> > > > >> > scan quickly (minutes), but the remaining 17 regions take
> >> > > > >> > significantly longer (hours)
> >> > > > >> >
> >> > > > >> > My expectation is to have the balancer equalize the size of
> >> > > > >> > the regions for each table.
> >> > > > >> >
> >> > > > >> > Thanks!
> >> > > > >> >
> >> > > > >> > - Nasron
> >> > > > >> >
> >> > > > >>
> >> > > >
> >> > > >
> >> > > >
> >> > > > --
> >> > > > Thanks,
> >> > > > Michael Antonov
> >> > > >
> >> > >
> >> >
> >>
>
>
>
> --
> Thanks,
> Michael Antonov
>
