hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Help: RegionTooBusyException: failed to get a lock in 60000 ms
Date Tue, 23 Sep 2014 03:52:33 GMT
Cycling previous bits (w.r.t. adjusting weights for table skew):
http://search-hadoop.com/m/DHED4CWSqW1/snapshot+timeout+problem&subj=Re+snapshot+timeout+problem

Cheers

On Mon, Sep 22, 2014 at 8:35 PM, Qiang Tian <tianq01@gmail.com> wrote:

> Hello, I ran into balancer-related issues 2 months ago and looked at that
> part of the code. Below is a summary:
> 1) By default, the HBase balancer (StochasticLoadBalancer by default) does
> not balance regions per table, i.e. all regions are treated as if they
> belonged to one table. So if you have many tables, especially tables with
> empty regions, you will probably end up unbalanced, and the balancer will
> probably not be triggered at all.
> I got this from code inspection; I could not reproduce my problem later,
> but deleting the empty regions did trigger the balancer correctly and
> left the regions well balanced.
>
> 2) There are other reasons why the balancer may not be triggered; see
> HMaster#balance. Turning on debug logging shows the related messages in the
> master log. In my case, it was not triggered because there were regions in
> transition:
> LOG.debug("Not running balancer because " + regionsInTransition.size() +
>           " region(s) in transition: " +
>           org.apache.commons.lang.StringUtils.abbreviate(
>               regionsInTransition.toString(), 256));
>
> The cause can be found in the region server log file.
>
> 3) Per-table balancing can be enabled with "hbase.master.loadbalance.bytable".
> However, it does not look like a good option when you have many tables: the
> master will issue a balance call for each table, one by one.
>
> 4) Region splits follow the normal balancer process, so if you have the issue
> in #1, splitting does not help balance. Pre-splitting at table creation looks
> fine, since it uses round-robin assignment.
>
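To make point 1 above concrete, here is a minimal, self-contained sketch in plain Java. It is not HBase code and does not model StochasticLoadBalancer's cost functions; it only counts regions. The class TableSkewDemo, the Placement type and the table/server names are made up for illustration. It shows how a cluster can look even when all regions are counted together while a single table sits entirely on one server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: hypothetical names, no HBase dependency.
final class TableSkewDemo {
    /** One hosted region: the table it belongs to and the server holding it. */
    static final class Placement {
        final String table;
        final String server;

        Placement(String table, String server) {
            this.table = table;
            this.server = server;
        }
    }

    /** Counts regions per server; a null table means "count every table together". */
    static Map<String, Integer> regionsPerServer(List<Placement> placements, String table) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Placement p : placements) {
            if (table == null || table.equals(p.table)) {
                counts.merge(p.server, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Made-up layout: table "a" lives only on rs1, table "b" only on rs2. */
    static List<Placement> sample() {
        List<Placement> placements = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            placements.add(new Placement("a", "rs1"));
            placements.add(new Placement("b", "rs2"));
        }
        return placements;
    }

    public static void main(String[] args) {
        List<Placement> placements = sample();
        // All tables together: both servers hold the same number of regions.
        System.out.println("cluster-wide: " + regionsPerServer(placements, null));
        // Table "a" alone: every region is on one server.
        System.out.println("table a only: " + regionsPerServer(placements, "a"));
    }
}
```

In this made-up layout the cluster-wide count is 4 regions on each server, while table "a" alone has all 4 of its regions on rs1. A balancer that only looks at the first view has nothing to do.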
>
>
> On Tue, Sep 23, 2014 at 2:12 AM, Bharath Vissapragada <bharathv@cloudera.com> wrote:
>
> > https://issues.apache.org/jira/browse/HBASE-11368 related to the original
> > issue too.
> >
> > On Mon, Sep 22, 2014 at 10:18 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > As you noted in the FIXME, there are some factors which should be tackled by
> > > the balancer / assignment manager.
> > >
> > > Please keep digging through the master log so that we can find out why the
> > > balancer is not fulfilling your goal.
> > >
> > > Cheers
> > >
> > > On Mon, Sep 22, 2014 at 10:09 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
> > >
> > > > OK, I fixed this by manually reassigning region servers to the newly created
> > > > regions.
> > > >
> > > >   def reassignRegionServer(admin: HBaseAdmin,
> > > >                            regions: Seq[HRegionInfo],
> > > >                            regionServers: Seq[ServerName]): Unit = {
> > > >     val rand = new Random()
> > > >     regions.foreach { r =>
> > > >       val idx = rand.nextInt(regionServers.size)
> > > >       val server = regionServers(idx)
> > > >       // FIXME: what if selected region server is dead?
> > > >       admin.move(r.getEncodedNameAsBytes,
> > > >                  server.getServerName.getBytes("UTF8"))
> > > >     }
> > > >   }
> > > >
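A hedged alternative to the random pick above: since the goal is an even spread, dealing regions out in turn keeps every server's count within one of every other server's. The sketch below is plain Java with no HBase dependency. RoundRobinPlan and its string arguments are hypothetical stand-ins for the encoded region names and server names, and it only computes a plan. It does not address the FIXME (dead servers), so the server list is assumed to contain live servers only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of HBase: it computes a plan and moves nothing.
final class RoundRobinPlan {
    private RoundRobinPlan() {}

    /** Maps each server name to the regions it should host, dealing regions out in turn. */
    static Map<String, List<String>> assign(List<String> regions, List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("need at least one live region server");
        }
        // LinkedHashMap keeps the servers in the order they were given.
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (String server : servers) {
            plan.put(server, new ArrayList<>());
        }
        for (int i = 0; i < regions.size(); i++) {
            // Region i goes to server (i mod number of servers).
            String server = servers.get(i % servers.size());
            plan.get(server).add(regions.get(i));
        }
        return plan;
    }

    public static void main(String[] args) {
        List<String> regions = Arrays.asList("r1", "r2", "r3", "r4", "r5");
        List<String> servers = Arrays.asList("rs-a", "rs-b");
        System.out.println(assign(regions, servers));
    }
}
```

Each (server, region) pair in the resulting plan could then be passed to admin.move in the same way as in the Scala snippet.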
> > > > er...
> > > >
> > > > Jianshi
> > > >
> > > > On Tue, Sep 23, 2014 at 12:24 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
> > > >
> > > > > Hmm...any workaround? I only want to do this:
> > > > >
> > > > > Rebalance the new regions *evenly* to all servers after manually adding
> > > > > splits, so later bulk insertions won't cause contention.
> > > > >
> > > > > P.S.
> > > > > Looks like two of the region servers which had the majority of the regions
> > > > > were down during major compaction... I guess they had too much data.
> > > > >
> > > > >
> > > > > Jianshi
> > > > >
> > > > > On Tue, Sep 23, 2014 at 12:13 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
> > > > >
> > > > >> Yes, I have access to the Master UI; however, logs/*.log cannot be opened or
> > > > >> downloaded. There must be some security restrictions in the proxy...
> > > > >>
> > > > >> Jianshi
> > > > >>
> > > > >> On Tue, Sep 23, 2014 at 12:06 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > > >>
> > > > >>> Do you have access to the Master UI?
> > > > >>>
> > > > >>> <master-address>:60010/logs/ would show you the list of log files.
> > > > >>>
> > > > >>> Then you can view <master-address>:60010/logs/hbase-<user>-master-XXX.log
> > > > >>>
> > > > >>> Cheers
> > > > >>>
> > > > >>> On Mon, Sep 22, 2014 at 9:00 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
> > > > >>>
> > > > >>> > Ah... I don't have access to HMaster logs... I need to ask the admin.
> > > > >>> >
> > > > >>> > Jianshi
> > > > >>> >
> > > > >>> > On Mon, Sep 22, 2014 at 11:49 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > > >>> >
> > > > >>> > > bq. assign per-table balancer class
> > > > >>> > >
> > > > >>> > > Not that I know of.
> > > > >>> > > Can you pastebin the master log involving output from the balancer?
> > > > >>> > >
> > > > >>> > > Cheers
> > > > >>> > >
> > > > >>> > > On Mon, Sep 22, 2014 at 8:29 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
> > > > >>> > >
> > > > >>> > > > Hi Ted,
> > > > >>> > > >
> > > > >>> > > > I moved setBalancerRunning before balancer and ran them twice. However, I
> > > > >>> > > > still got a highly skewed region distribution.
> > > > >>> > > >
> > > > >>> > > > I guess it's because of the StochasticLoadBalancer. Can I assign a per-table
> > > > >>> > > > balancer class in HBase?
> > > > >>> > > >
> > > > >>> > > >
> > > > >>> > > > Jianshi
> > > > >>> > > >
> > > > >>> > > > On Mon, Sep 22, 2014 at 9:50 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > > >>> > > >
> > > > >>> > > > > admin.setBalancerRunning() call should precede the call to
> > > > >>> > > > > admin.balancer().
> > > > >>> > > > >
> > > > >>> > > > > You can inspect master log to see whether regions are being moved off the
> > > > >>> > > > > heavily loaded server.
> > > > >>> > > > >
> > > > >>> > > > > Cheers
> > > > >>> > > > >
> > > > >>> > > > > On Mon, Sep 22, 2014 at 1:42 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
> > > > >>> > > > >
> > > > >>> > > > > > Hi Ted and others,
> > > > >>> > > > > >
> > > > >>> > > > > > I did the following after adding splits (without data) to my table; however, the
> > > > >>> > > > > > region distribution is still very imbalanced (one region server has 221 regions
> > > > >>> > > > > > and the other 50 region servers have about 4~8 regions each).
> > > > >>> > > > > >
> > > > >>> > > > > >       admin.balancer()
> > > > >>> > > > > >       admin.setBalancerRunning(true, true)
> > > > >>> > > > > >
> > > > >>> > > > > > The balancer class in my HBase cluster is
> > > > >>> > > > > >
> > > > >>> > > > > > org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer
> > > > >>> > > > > >
> > > > >>> > > > > > So, is this behavior expected? Can I assign a different balancer class to my
> > > > >>> > > > > > tables (I don't have HBase admin permission)? Which one should I use?
> > > > >>> > > > > >
> > > > >>> > > > > > I just want HBase to evenly distribute the regions even if they don't have data
> > > > >>> > > > > > (that's the purpose of pre-split, I think).
> > > > >>> > > > > >
> > > > >>> > > > > >
> > > > >>> > > > > > Jianshi
> > > > >>> > > > > >
> > > > >>> > > > > >
> > > > >>> > > > > > On Sat, Sep 6, 2014 at 12:45 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > > >>> > > > > >
> > > > >>> > > > > > > Yes. See the following method in HBaseAdmin:
> > > > >>> > > > > > >
> > > > >>> > > > > > >   public boolean balancer()
> > > > >>> > > > > > >
> > > > >>> > > > > > >
> > > > >>> > > > > > > On Fri, Sep 5, 2014 at 9:38 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
> > > > >>> > > > > > >
> > > > >>> > > > > > > > Thanks Ted!
> > > > >>> > > > > > > >
> > > > >>> > > > > > > > Didn't know I still need to run the 'balancer' command.
> > > > >>> > > > > > > >
> > > > >>> > > > > > > > Is there a way to do it programmatically?
> > > > >>> > > > > > > >
> > > > >>> > > > > > > > Jianshi
> > > > >>> > > > > > > >
> > > > >>> > > > > > > >
> > > > >>> > > > > > > >
> > > > >>> > > > > > > > On Sat, Sep 6, 2014 at 12:29 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > > >>> > > > > > > >
> > > > >>> > > > > > > > > After splitting the region, you may need to run balancer to spread the new
> > > > >>> > > > > > > > > regions out.
> > > > >>> > > > > > > > >
> > > > >>> > > > > > > > > Cheers
> > > > >>> > > > > > > > >
> > > > >>> > > > > > > > >
> > > > >>> > > > > > > > > On Fri, Sep 5, 2014 at 9:25 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
> > > > >>> > > > > > > > >
> > > > >>> > > > > > > > > > Hi Shahab,
> > > > >>> > > > > > > > > >
> > > > >>> > > > > > > > > > I see, that seems to be the right way...
> > > > >>> > > > > > > > > >
> > > > >>> > > > > > > > > >
> > > > >>> > > > > > > > > > On Sat, Sep 6, 2014 at 12:21 AM, Shahab Yunus <shahab.yunus@gmail.com> wrote:
> > > > >>> > > > > > > > > >
> > > > >>> > > > > > > > > > > Shahab
> > > > >>> > > > > > > > > >
> > > > >>> > > > > > > > > >
> > > > >>> > > > > > > > > >
> > > > >>> > > > > > > > > >
> > > > >>> > > > > > > > > >
> > > > >>> > > > > > > > > > --
> > > > >>> > > > > > > > > > Jianshi Huang
> > > > >>> > > > > > > > > >
> > > > >>> > > > > > > > > > LinkedIn: jianshi
> > > > >>> > > > > > > > > > Twitter: @jshuang
> > > > >>> > > > > > > > > > Github & Blog: http://huangjs.github.com/
> > > > >>> > > > > > > > > >
> > --
> > Bharath Vissapragada
> > <http://www.cloudera.com>
> >
>
