hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Where is HBase failed servers list stored
Date Wed, 04 Mar 2015 04:55:37 GMT
Please see HBASE-13067, "Fix caching of stubs to allow IP address changes of
restarted remote servers".

Cheers
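
To illustrate the kind of problem that issue title describes, here is a minimal sketch of how a cache keyed by hostname can keep returning an old IP address after the host's address changes. This is an assumption about the general mechanism, not HBase's actual code: the class StaleAddressSketch, its methods, and the new address 192.168.2.21 are hypothetical, and the currentIp parameter stands in for whatever the resolver (for example /etc/hosts) would return at call time.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not taken from the HBase source. */
public class StaleAddressSketch {
    // Cache keyed by "hostname:port"; the value holds an already-resolved IP.
    private final Map<String, InetSocketAddress> cache = new HashMap<>();

    /** Returns the cached address if present, so a later IP change is never seen. */
    public InetSocketAddress cachedLookup(String host, int port, byte[] currentIp)
            throws UnknownHostException {
        String key = host + ":" + port;
        InetSocketAddress cached = cache.get(key);
        if (cached == null) {
            // getByAddress builds the address from raw bytes without a DNS lookup.
            cached = new InetSocketAddress(InetAddress.getByAddress(host, currentIp), port);
            cache.put(key, cached);
        }
        return cached;
    }

    /** Builds a new address on every call, so it always reflects the current IP. */
    public InetSocketAddress freshLookup(String host, int port, byte[] currentIp)
            throws UnknownHostException {
        return new InetSocketAddress(InetAddress.getByAddress(host, currentIp), port);
    }
}
```

In this sketch, a second cachedLookup for host1 after the address has changed still returns 192.168.2.20, which matches the symptom reported below: the process holding the cache keeps its old entry until it is restarted or the entry is invalidated.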

On Tue, Mar 3, 2015 at 8:26 PM, Sandeep L <sandeepvreddy@outlook.com> wrote:

> Hi nkeywal,
> While trying to get more details about this issue, I found that the
> HMaster is trying to connect to the wrong IP address.
> Here is the exact issue:
> For unavoidable reasons we had to change the IP address of a
> regionserver, and we then updated the new IP address in the /etc/hosts file
> on all HBase servers. I started the RegionServer from the master with the
> start-hbase.sh script, and jps output on the regionserver shows that the
> regionserver process is up and running.
> But when running the hbase balancer, the HMaster tries to connect to the old
> IP address instead of the new one.
> One more thing: when I checked the regionserver status on port 60010, it
> is shown as up and running.
> Thanks, Sandeep.
>
> > From: nkeywal@gmail.com
> > Date: Tue, 3 Mar 2015 19:01:01 +0100
> > Subject: Re: Where is HBase failed servers list stored
> > To: user@hbase.apache.org
> >
> > It's in local memory. When HBase cannot connect to a server, it puts it
> > into the "failedServerList" for 2 seconds. This is to avoid having all the
> > threads going into a potentially long socket timeout. Are you sure that you
> > can connect from the master to this machine/port?
> >
> > You can change the time it stays in the list with
> > hbase.ipc.client.failed.servers.expiry (in milliseconds), but it should not
> > help.
> >
> > You should have another exception before this one in the logs (the one that
> > initially put this region server in this failedServerList).
> >
> > On Tue, Mar 3, 2015 at 12:08 PM, Sandeep L <sandeepvreddy@outlook.com>
> > wrote:
> >
> > > Hi,
> > > While trying to run the hbase balancer I am getting the error message
> > > "This server is in the failed servers list". Because of this, the
> > > cluster is not getting balanced.
> > > Even though the regionserver is up and running, the hmaster is unable to
> > > connect to it.
> > > The odd thing is that the hmaster is able to start the regionserver, and
> > > it is detected as up and running, but regions cannot be assigned to it.
> > > Can someone suggest a solution for this?
> > > Following is the full stack trace:
> > > org.apache.hadoop.hbase.ipc.RpcClient$FailedServerException: This server is in the failed servers list: host1/192.168.2.20:60020
> > >     at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:853)
> > >     at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1543)
> > >     at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)
> > >     at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
> > >     at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
> > >     at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964)
> > >     at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:671)
> > >     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2097)
> > >     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1577)
> > >     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1550)
> > >     at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104)
> > >     at org.apache.hadoop.hbase.master.AssignmentManager.handleRegion(AssignmentManager.java:999)
> > >     at org.apache.hadoop.hbase.master.AssignmentManager$6.run(AssignmentManager.java:1447)
> > >     at org.apache.hadoop.hbase.master.AssignmentManager$3.run(AssignmentManager.java:1260)
> > >     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> > >     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > >     at java.lang.Thread.run(Thread.java:745)
> > > Thanks, Sandeep.
>
>
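
The expiring failed-servers list that nkeywal describes in the quoted reply can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not HBase's implementation: the class FailedServerListSketch and its method names are hypothetical, and the caller passes the current time explicitly so the expiry behaviour is easy to follow. The 2000 ms value corresponds to the 2 seconds mentioned in the thread, which hbase.ipc.client.failed.servers.expiry is said to control.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not taken from the HBase source. */
public class FailedServerListSketch {
    // Maps a server address string to the time (ms) at which its entry expires.
    private final Map<String, Long> expiryByAddress = new HashMap<>();
    private final long expiryMillis;

    public FailedServerListSketch(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    /** Records a connection failure; the entry stays valid for expiryMillis. */
    public synchronized void addFailure(String address, long nowMillis) {
        expiryByAddress.put(address, nowMillis + expiryMillis);
    }

    /** Returns true while the entry is unexpired, so callers can fail fast. */
    public synchronized boolean isFailed(String address, long nowMillis) {
        Long expiry = expiryByAddress.get(address);
        if (expiry == null) {
            return false;
        }
        if (nowMillis >= expiry) {
            // Entry has expired: drop it so the next caller retries the connection.
            expiryByAddress.remove(address);
            return false;
        }
        return true;
    }
}
```

With an expiry of 2000 ms, a server marked as failed at time 0 is rejected immediately until time 2000, after which a real connection attempt is made again. This is consistent with nkeywal's point that the list is only a short-lived guard, so the exception that first put the server in the list is the one to look for in the logs.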
