hbase-user mailing list archives

From Yi Liang <white...@gmail.com>
Subject Re: Not running balancer because processing dead regionserver(s)
Date Tue, 22 Feb 2011 06:04:34 GMT
Yes, the server zcl crashed at that time.

But after I restarted it later, it's still in the dead server list.

2011-02-18 10:39:26,895 INFO org.apache.hadoop.hbase.master.ServerManager:
Registering server=zcl.local,60020,1297996817352, regionCount=0,
userLoad=false
2011-02-18 10:39:35,062 DEBUG org.apache.hadoop.hbase.master.HMaster: Not
running balancer because processing dead regionserver(s):
[Docete.local,60020,1297919410096, liym.local,60020,1297919445796,
zcl.local,60020,1297919367472]
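
The server names in these log lines have the form host,port,startcode, so a restarted server shows up under a new startcode while its old instance can remain in the dead list. A minimal sketch for spotting such stale entries, assuming the log text is pasted in as plain strings; the helper names below are hypothetical and not part of HBase:

```python
import re

# host,port,startcode as printed in the master log
SERVER_NAME = re.compile(r"([\w.-]+),(\d+),(\d+)")

def parse_server_names(text):
    """Return (host, port, startcode) tuples found in a log fragment."""
    return [(h, int(p), int(s)) for h, p, s in SERVER_NAME.findall(text)]

def stale_dead_entries(dead_text, registered_text):
    """Dead-list entries whose host:port later registered with a newer startcode."""
    newest = {}
    for host, port, start in parse_server_names(registered_text):
        key = (host, port)
        newest[key] = max(newest.get(key, 0), start)
    return [
        (host, port, start)
        for host, port, start in parse_server_names(dead_text)
        if newest.get((host, port), 0) > start
    ]

# Strings copied from the two log messages above
dead = ("[Docete.local,60020,1297919410096, liym.local,60020,1297919445796, "
        "zcl.local,60020,1297919367472]")
registered = "Registering server=zcl.local,60020,1297996817352, regionCount=0,"

print(stale_dead_entries(dead, registered))
# prints [('zcl.local', 60020, 1297919367472)]
```

This only compares names found in the log text; it says nothing about why the master has not finished processing those entries.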

On Tue, Feb 22, 2011 at 1:48 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> Looks like there was a connectivity issue:
>
> java.net.NoRouteToHostException: No route to host
>
> On Sun, Feb 20, 2011 at 10:09 PM, Yi Liang <whitesky@gmail.com> wrote:
>
> > The related log is at: http://pastebin.com/0a1CjDUD
> >
> > It's OK now after restarting HBase, but I'm still curious why it happened.
> >
> > Thanks,
> > Yi
> > On Sat, Feb 19, 2011 at 3:58 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
> >
> > > The master should finish processing those dead servers at some point,
> > > and it seems that's not happening? Unfortunately, without the log nobody
> > > can tell why. If you can post the complete log on pastebin or put it
> > > on a web server, then we could take a look.
> > >
> > > J-D
> > >
> > > On Fri, Feb 18, 2011 at 12:39 AM, Yi Liang <whitesky@gmail.com> wrote:
> > > > Hi all,
> > > >
> > > > We have an HBase cluster with 10 region servers running HBase 0.90.0 + CDH3.
> > > > We're now importing big data into HBase.
> > > >
> > > > During the process, 2 servers crashed, but after restarting them, they're
> > > > no longer assigned any regions, while regions on other servers keep
> > > > splitting as more data is inserted.
> > > >
> > > > In the master log, we see periodic messages like:
> > > >
> > > > 2011-02-18 16:09:35,067 DEBUG org.apache.hadoop.hbase.master.HMaster: Not
> > > > running balancer because processing dead regionserver(s):
> > > > [zcl.local,60020,1297996817352, qics.local,60020,1297919358488,
> > > > Docete.local,60020,1297919410096, liym.local,60020,1297919445796,
> > > > zcl.local,60020,1297919367472]
> > > >
> > > > zcl.local and qics.local are the machines we restarted; the other 2
> > > > machines have kept running without a restart and are actually still
> > > > serving regions.
> > > >
> > > > From the shell status:
> > > > 10 servers, 5 dead, 10.1000 average Load
> > > >
> > > > Why are there dead servers? And how can we clear them so that the
> > > > balancer can run?
> > > >
> > > > Thanks,
> > > > Yi
> > > >
> > >
> >
>
