hbase-user mailing list archives

From Dejan Menges <dejan.men...@gmail.com>
Subject Re: Stochastic Balancer by tables
Date Fri, 19 Jun 2015 07:47:25 GMT
Just have to say that hbase.master.loadbalance.bytable saved us after we
discovered it. In our case we had to set it manually to true, and then it
was easy to catch hot spotting on unusually large regions and handle it.
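
(A minimal sketch of the kind of check described above, not taken from the
original message. It assumes you already have per-region sizes for one table,
for example read off the master UI. The function name, the region names, the
sizes, and the 3x-median threshold are all hypothetical.)

```python
from statistics import median


def find_oversized_regions(region_sizes_mb, factor=3.0):
    """Return names of regions larger than factor * median size, largest first."""
    if not region_sizes_mb:
        return []
    typical = median(region_sizes_mb.values())
    oversized = [
        name for name, size in region_sizes_mb.items() if size > factor * typical
    ]
    # Largest regions first, since those are the likeliest hot spots.
    return sorted(oversized, key=lambda name: region_sizes_mb[name], reverse=True)


# Hypothetical sizes in MB for the regions of a single table.
sizes = {
    "region-a": 980,
    "region-b": 1010,
    "region-c": 1005,
    "region-d": 9400,
    "region-e": 995,
}
print(find_oversized_regions(sizes))
```

Regions flagged this way are candidates for a manual split. The balancer itself
only moves regions between regionservers and does not change their size.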

Btw, +1 for HBASE-13103. Had to say it; it's something that makes me want to
start upgrading our HDP stack on Monday morning.

On Thu, Jun 18, 2015 at 11:04 PM, Bryan Beaudreault <
bbeaudreault@hubspot.com> wrote:

> Just had to say, https://issues.apache.org/jira/browse/HBASE-13103 looks
> *AWESOME*
>
> On Thu, Jun 18, 2015 at 5:00 PM Mikhail Antonov <olorinbant@gmail.com>
> wrote:
>
> > Yeah, I could see 2 reasons for the remaining few regions to take a
> > disproportionately long time: 1) those regions are disproportionately
> > large (you should be able to quickly confirm it) and 2) they happened
> > to be hosted on really slow/overloaded machine(s). #1 seems far more
> > likely to me.
> >
> > And as Nick said, there's ongoing effort to provide exactly what
> > you've described - centralized periodic analysis of region sizes and
> > equalization as needed (somewhat complementary to balancing), and any
> > feedback (especially from folks experiencing real issues with unequal
> > region sizes) is much appreciated.
> >
> > -Mikhail
> >
> > On Thu, Jun 18, 2015 at 10:07 AM, Nick Dimiduk <ndimiduk@gmail.com> wrote:
> > > If you're interested in region size balancing, please have a look at
> > > https://issues.apache.org/jira/browse/HBASE-13103 . Please provide feedback
> > > as we're hoping to have an early version available in 1.2.
> > >
> > > Which reminds me, I owe Mikhail another review...
> > >
> > > On Thu, Jun 18, 2015 at 9:39 AM, Elliott Clark <eclark@apache.org> wrote:
> > >
> > >> The balancer is not responsible for region size decisions. The balancer is
> > >> only responsible for deciding which regionservers should host which
> > >> regions.
> > >> Splits are determined by the data size of a region. See max store file size.
> > >>
> > >> On Thu, Jun 18, 2015 at 7:50 AM, Nasron Cheong <nasron@gmail.com> wrote:
> > >>
> > >> > Hi,
> > >> >
> > >> > I've noticed there are two settings available when using the HBase balancer
> > >> > (specifically the default stochastic balancer)
> > >> >
> > >> > hbase.master.balancer.stochastic.tableSkewCost
> > >> >
> > >> > hbase.master.loadbalance.bytable
> > >> >
> > >> > How do these two settings relate? The documentation indicates when using
> > >> > the stochastic balancer that 'bytable' should be set to false?
> > >> >
> > >> > Our deployment relies on very few, very large tables, and I've noticed bad
> > >> > distribution when accessing some of the tables. E.g. there are 443 regions
> > >> > for a single table, but when doing a MR job over a full scan of the table,
> > >> > the first 426 regions scan quickly (minutes), but the remaining 17 regions
> > >> > take significantly longer (hours)
> > >> >
> > >> > My expectation is to have the balancer equalize the size of the regions for
> > >> > each table.
> > >> >
> > >> > Thanks!
> > >> >
> > >> > - Nasron
> > >> >
> > >>
> >
> >
> >
> > --
> > Thanks,
> > Michael Antonov
> >
>
