hbase-user mailing list archives

From Qiang Tian <tian...@gmail.com>
Subject Re: Help: RegionTooBusyException: failed to get a lock in 60000 ms
Date Tue, 23 Sep 2014 03:35:11 GMT
Hello, I ran into balancer-related issues 2 months ago and looked at
that part of the code. Below is a summary:
1) By default, the HBase balancer (StochasticLoadBalancer by default) does not
balance regions per table, i.e. all regions are treated as if they belonged
to one table. So if you have many tables, especially tables with empty
regions, individual tables can end up unbalanced, and the balancer may not
be triggered at all.
This comes from code inspection; I could not reproduce my problem later.
But in my case, deleting the empty regions did trigger the balancer
correctly and left the regions well balanced.

2) There are other reasons why the balancer may not be triggered; see
HMaster#balance. Turning on debug logging shows the related messages in the
master log. In my case, it was not triggered because there were regions in
transition:
LOG.debug("Not running balancer because " + regionsInTransition.size() +
          " region(s) in transition: " +
          org.apache.commons.lang.StringUtils.
            abbreviate(regionsInTransition.toString(), 256));

The cause of the regions in transition can be found in the regionserver log file.

3) Per-table balancing can be enabled with "hbase.master.loadbalance.bytable".
However, it does not look like a good option when you have many tables: the
master will issue a balance call for each table, one by one.

4) Regions created by a split go through the normal balancer process, so if
you hit the issue in #1, splitting does not help balance. Pre-splitting at
table creation looks fine, since it uses round-robin assignment.
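
To illustrate point 1, here is a toy model in plain Java. It is not HBase code: the Region class, the server names, and the table names are all made up for the example. The real StochasticLoadBalancer weighs several cost functions rather than a simple count, so treat this only as a sketch of how a cluster-wide view can look even while one table sits entirely on one server.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BalanceView {
    /** Toy stand-in for a region: the table it belongs to and the server hosting it. */
    static final class Region {
        final String table;
        final String server;
        Region(String table, String server) { this.table = table; this.server = server; }
    }

    /** Regions per server; a null table means "count all tables together". */
    static Map<String, Integer> countPerServer(List<Region> regions, List<String> servers, String table) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String s : servers) counts.put(s, 0);
        for (Region r : regions) {
            if (table == null || table.equals(r.table)) counts.merge(r.server, 1, Integer::sum);
        }
        return counts;
    }

    /** Difference between the most and least loaded server. */
    static int spread(Map<String, Integer> counts) {
        return Collections.max(counts.values()) - Collections.min(counts.values());
    }

    /** Hypothetical layout: t_new lives only on rs1, t_other is split over rs2 and rs3. */
    static List<Region> sampleRegions() {
        List<Region> regions = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            regions.add(new Region("t_new", "rs1"));
            regions.add(new Region("t_other", "rs2"));
            regions.add(new Region("t_other", "rs3"));
        }
        return regions;
    }

    public static void main(String[] args) {
        List<String> servers = List.of("rs1", "rs2", "rs3");
        List<Region> regions = sampleRegions();
        // Cluster-wide view: every server holds 6 regions, so nothing looks wrong.
        System.out.println("all tables: " + countPerServer(regions, servers, null));
        // Per-table view: all 6 regions of t_new sit on rs1.
        System.out.println("t_new only: " + countPerServer(regions, servers, "t_new"));
    }
}
```

With this layout the cluster-wide spread is 0 while the spread for t_new alone is 6, which is the kind of skew a bulk load into t_new would hit.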

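On point 4, the idea of round-robin assignment can be sketched as below. This is again plain Java with invented region and server names, not the assignment code inside HBase; it only shows why handing out regions in turn keeps per-server counts within one of each other, unlike picking a server at random for each region.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RoundRobinAssign {
    /** Assigns region i to server (i mod number of servers). Server names are assumed unique. */
    static Map<String, List<String>> assign(List<String> regions, List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("need at least one server");
        }
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (String s : servers) plan.put(s, new ArrayList<>());
        for (int i = 0; i < regions.size(); i++) {
            plan.get(servers.get(i % servers.size())).add(regions.get(i));
        }
        return plan;
    }

    public static void main(String[] args) {
        List<String> regions = new ArrayList<>();
        for (int i = 0; i < 10; i++) regions.add("region-" + i);
        Map<String, List<String>> plan = assign(regions, List.of("rs1", "rs2", "rs3"));
        // Per-server counts differ by at most one: 4, 3 and 3 for ten regions on three servers.
        plan.forEach((server, assigned) -> System.out.println(server + " -> " + assigned.size() + " regions"));
    }
}
```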


On Tue, Sep 23, 2014 at 2:12 AM, Bharath Vissapragada <bharathv@cloudera.com> wrote:

> https://issues.apache.org/jira/browse/HBASE-11368 is related to the original
> issue too.
>
> On Mon, Sep 22, 2014 at 10:18 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > As you noted in the FIXME, there are some factors which should be tackled
> > by the balancer / assignment manager.
> >
> > Please keep digging through the master log so that we can find out why the
> > balancer is not fulfilling your goal.
> >
> > Cheers
> >
> > On Mon, Sep 22, 2014 at 10:09 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
> >
> > > Ok, I fixed this by manually reassigning the newly created regions to
> > > region servers.
> > >
> > >   def reassignRegionServer(admin: HBaseAdmin, regions: Seq[HRegionInfo],
> > >       regionServers: Seq[ServerName]): Unit = {
> > >     val rand = new Random()
> > >     regions.foreach { r =>
> > >       val idx = rand.nextInt(regionServers.size)
> > >       val server = regionServers(idx)
> > >       // FIXME: what if selected region server is dead?
> > >       admin.move(r.getEncodedNameAsBytes,
> > >         server.getServerName.getBytes("UTF8"))
> > >     }
> > >   }
> > >
> > > er...
> > >
> > > Jianshi
> > >
> > > On Tue, Sep 23, 2014 at 12:24 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
> > >
> > > > Hmm...any workaround? I only want to do this:
> > > >
> > > > Rebalance the new regions *evenly* across all servers after manually
> > > > adding splits, so later bulk insertions won't cause contention.
> > > >
> > > > P.S.
> > > > Looks like two of the region servers which had the majority of the
> > > > regions were down during major compaction... I guess they had too much data.
> > > >
> > > >
> > > > Jianshi
> > > >
> > > > On Tue, Sep 23, 2014 at 12:13 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
> > > >
> > > >> Yes, I have access to the Master UI; however, logs/*.log cannot be opened
> > > >> or downloaded. There must be some security restrictions in the proxy...
> > > >>
> > > >> Jianshi
> > > >>
> > > >> On Tue, Sep 23, 2014 at 12:06 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >>
> > > >>> Do you have access to the Master UI?
> > > >>>
> > > >>> <master-address>:60010/logs/ would show you the list of log files.
> > > >>>
> > > >>> Then you can view <master-address>:60010/logs/hbase-<user>-master-XXX.log
> > > >>>
> > > >>> Cheers
> > > >>>
> > > >>> On Mon, Sep 22, 2014 at 9:00 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
> > > >>>
> > > >>> > Ah... I don't have access to HMaster logs... I need to ask the admin.
> > > >>> >
> > > >>> > Jianshi
> > > >>> >
> > > >>> > On Mon, Sep 22, 2014 at 11:49 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >>> >
> > > >>> > > bq. assign per-table balancer class
> > > >>> > >
> > > >>> > > Not that I know of.
> > > >>> > > Can you pastebin the master log involving output from the balancer?
> > > >>> > >
> > > >>> > > Cheers
> > > >>> > >
> > > >>> > > On Mon, Sep 22, 2014 at 8:29 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
> > > >>> > >
> > > >>> > > > Hi Ted,
> > > >>> > > >
> > > >>> > > > I moved setBalancerRunning before balancer and ran them twice. However, I
> > > >>> > > > still got a highly skewed region distribution.
> > > >>> > > >
> > > >>> > > > I guess it's because of the StochasticLoadBalancer. Can I assign a
> > > >>> > > > per-table balancer class in HBase?
> > > >>> > > >
> > > >>> > > >
> > > >>> > > > Jianshi
> > > >>> > > >
> > > >>> > > > On Mon, Sep 22, 2014 at 9:50 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >>> > > >
> > > >>> > > > > The admin.setBalancerRunning() call should precede the call to
> > > >>> > > > > admin.balancer().
> > > >>> > > > >
> > > >>> > > > > You can inspect the master log to see whether regions are being moved
> > > >>> > > > > off the heavily loaded server.
> > > >>> > > > >
> > > >>> > > > > Cheers
> > > >>> > > > >
> > > >>> > > > > On Mon, Sep 22, 2014 at 1:42 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
> > > >>> > > > >
> > > >>> > > > > > Hi Ted and others,
> > > >>> > > > > >
> > > >>> > > > > > I did the following after adding splits (without data) to my table;
> > > >>> > > > > > however, the regions are still very imbalanced (one region server has
> > > >>> > > > > > 221 regions and the other 50 region servers have about 4~8 regions each).
> > > >>> > > > > >
> > > >>> > > > > >       admin.balancer()
> > > >>> > > > > >       admin.setBalancerRunning(true, true)
> > > >>> > > > > >
> > > >>> > > > > > The balancer class in my HBase cluster is
> > > >>> > > > > >
> > > >>> > > > > > org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer
> > > >>> > > > > >
> > > >>> > > > > > So, is this behavior expected? Can I assign a different balancer class to
> > > >>> > > > > > my tables (I don't have HBase admin permission)? Which one should I use?
> > > >>> > > > > >
> > > >>> > > > > > I just want HBase to evenly distribute the regions even if they don't have
> > > >>> > > > > > data (that's the purpose of pre-split, I think).
> > > >>> > > > > >
> > > >>> > > > > >
> > > >>> > > > > > Jianshi
> > > >>> > > > > >
> > > >>> > > > > >
> > > >>> > > > > > On Sat, Sep 6, 2014 at 12:45 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >>> > > > > >
> > > >>> > > > > > > Yes. See the following method in HBaseAdmin:
> > > >>> > > > > > >
> > > >>> > > > > > >   public boolean balancer()
> > > >>> > > > > > >
> > > >>> > > > > > >
> > > >>> > > > > > > On Fri, Sep 5, 2014 at 9:38 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
> > > >>> > > > > > >
> > > >>> > > > > > > > Thanks Ted!
> > > >>> > > > > > > >
> > > >>> > > > > > > > Didn't know I still need to run the 'balancer' command.
> > > >>> > > > > > > >
> > > >>> > > > > > > > Is there a way to do it programmatically?
> > > >>> > > > > > > >
> > > >>> > > > > > > > Jianshi
> > > >>> > > > > > > >
> > > >>> > > > > > > >
> > > >>> > > > > > > >
> > > >>> > > > > > > > On Sat, Sep 6, 2014 at 12:29 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >>> > > > > > > >
> > > >>> > > > > > > > > After splitting the region, you may need to run the balancer to spread
> > > >>> > > > > > > > > the new regions out.
> > > >>> > > > > > > > >
> > > >>> > > > > > > > > Cheers
> > > >>> > > > > > > > >
> > > >>> > > > > > > > >
> > > >>> > > > > > > > > On Fri, Sep 5, 2014 at 9:25 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
> > > >>> > > > > > > > >
> > > >>> > > > > > > > > > Hi Shahab,
> > > >>> > > > > > > > > >
> > > >>> > > > > > > > > > I see, that seems to be the right way...
> > > >>> > > > > > > > > >
> > > >>> > > > > > > > > >
> > > >>> > > > > > > > > > On Sat, Sep 6, 2014 at 12:21 AM, Shahab Yunus <shahab.yunus@gmail.com> wrote:
> > > >>> > > > > > > > > >
> > > >>> > > > > > > > > > > Shahab
> > > >>> > > > > > > > > >
> > > >>> > > > > > > > > >
> > > >>> > > > > > > > > >
> > > >>> > > > > > > > > >
> > > >>> > > > > > > > > >
> > > >>> > > > > > > > > > --
> > > >>> > > > > > > > > > Jianshi Huang
> > > >>> > > > > > > > > >
> > > >>> > > > > > > > > > LinkedIn: jianshi
> > > >>> > > > > > > > > > Twitter: @jshuang
> > > >>> > > > > > > > > > Github & Blog: http://huangjs.github.com/
> > > >>> > > > > > > > > >
> --
> Bharath Vissapragada
> <http://www.cloudera.com>
>
