hbase-dev mailing list archives

From Ramkrishna S Vasudevan <ramakrish...@huawei.com>
Subject RE: Reg: HRegionServer not able to communicate with the new HMaster when HMaster switching happens
Date Tue, 05 Apr 2011 06:10:34 GMT
Hi 

No, it does not recover; the region server is not able to get the new
HMaster address.  I am using HBase version 0.90.0.
We were also able to identify the problem.
The getMaster() API in HRegionServer has a while loop in which the region
server tries to connect to the HMaster:
    while ((masterAddress = masterAddressManager.getMasterAddress()) == null) {
      if (stopped) {
        return null;
      }
      LOG.debug("No master found, will retry");
      sleeper.sleep();
    }
    HMasterRegionInterface master = null;
    while (!stopped && master == null) {
      try {
        // Do initial RPC setup. The final argument indicates that the RPC
        // should retry indefinitely.
        master = (HMasterRegionInterface) HBaseRPC.waitForProxy(
            HMasterRegionInterface.class, HBaseRPCProtocolVersion.versionID,
            masterAddress.getInetSocketAddress(), this.conf, -1,
            this.rpcTimeout, this.rpcTimeout);
      } catch (IOException e) {
        e = e instanceof RemoteException ?
            ((RemoteException)e).unwrapRemoteException() : e;
        if (e instanceof ServerNotRunningException) {
          LOG.info("Master isn't available yet, retrying");
        } else {
          LOG.warn("Unable to connect to master. Retrying. Error was:", e);
        }
        sleeper.sleep();
      }

The masterAddress is fetched only once, in the first while loop, when the
master is obtained for the first time.  Later, when the HMaster goes down,
the region server retries in the second while loop.  There it keeps trying
to connect using masterAddress, which still holds the old master's address,
because only the masterAddressManager is updated with the new (switched)
address.

So, as a fix for this problem, we can get the updated master address from
the masterAddressManager inside the second loop, as follows:

        masterAddress = masterAddressManager.getMasterAddress();
        master = (HMasterRegionInterface) HBaseRPC.waitForProxy(
            HMasterRegionInterface.class, HBaseRPCProtocolVersion.versionID,
            masterAddress.getInetSocketAddress(), this.conf, -1,
            this.rpcTimeout, this.rpcTimeout);

This may resolve the issue.  
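
To illustrate the idea outside of HBase, here is a minimal, self-contained
sketch of a retry loop that re-reads the address on every attempt.  It is
not the HRegionServer code: AddressSource, Connector and connectWithRefresh
are hypothetical names standing in for masterAddressManager and
HBaseRPC.waitForProxy, the host names are made up, and the attempt limit is
an assumption added so the example terminates.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.concurrent.atomic.AtomicReference;

final class MasterReconnectSketch {

    /** Hypothetical stand-in for masterAddressManager: returns the currently published address. */
    interface AddressSource {
        InetSocketAddress current();
    }

    /** Hypothetical stand-in for HBaseRPC.waitForProxy: fails if nothing answers at the address. */
    interface Connector<T> {
        T connect(InetSocketAddress address) throws IOException;
    }

    /**
     * Tries to connect up to maxAttempts times. The address is re-read from
     * the source on every attempt, so a failover published between attempts
     * is picked up. Returns null if no attempt succeeds.
     */
    static <T> T connectWithRefresh(AddressSource source, Connector<T> connector,
                                    int maxAttempts, long sleepMillis)
            throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // The key point: refresh the address inside the retry loop.
            InetSocketAddress address = source.current();
            if (address != null) {
                try {
                    return connector.connect(address);
                } catch (IOException e) {
                    // Connection failed (for example, the old master is down); retry below.
                }
            }
            Thread.sleep(sleepMillis);
        }
        return null;
    }

    public static void main(String[] args) throws InterruptedException {
        InetSocketAddress oldMaster = InetSocketAddress.createUnresolved("master-a", 60000);
        InetSocketAddress newMaster = InetSocketAddress.createUnresolved("master-b", 60000);
        AtomicReference<InetSocketAddress> tracked = new AtomicReference<>(oldMaster);

        // Simulated RPC: the old master refuses the connection, and the
        // failover to the new master is published at that moment.
        Connector<String> connector = address -> {
            if (address.equals(oldMaster)) {
                tracked.set(newMaster);
                throw new IOException("Connection refused");
            }
            return "connected to " + address.getHostString();
        };

        System.out.println(connectWithRefresh(tracked::get, connector, 5, 10L));
    }
}
```

If the address were read only once before the loop, every attempt in this
simulation would hit master-a and fail, which matches the behaviour
described above.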


Regards
Ram

****************************************************************************
***********
This e-mail and attachments contain confidential information from HUAWEI,
which is intended only for the person or entity whose address is listed
above. Any use of the information contained herein in any way (including,
but not limited to, total or partial disclosure, reproduction, or
dissemination) by persons other than the intended recipient's) is
prohibited. If you receive this e-mail in error, please notify the sender by
phone or email immediately and delete it!

-----Original Message-----
From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
Sent: Tuesday, April 05, 2011 11:05 AM
To: dev@hbase.apache.org; ramakrishnas@huawei.com
Subject: Re: Reg: HRegionServer not able to communicate with the new HMaster
when HMaster switching happens

RegionServer should get notification of new master.  Do you not see
that in the regionserver logs?  Does it never recover?  What version
of hbase?
Thanks,
St.Ack

On Mon, Apr 4, 2011 at 9:57 PM, Ramkrishna S Vasudevan
<ramakrishnas@huawei.com> wrote:
> Hi
>
>
>
> When HBase is running in HA mode, the RegionServer is connected to the
> Active HMaster.
>
> When a switch over happens then the RegionServer is not able to connect to
> the new Active HMaster.
>
>
>
> A "Connection refused" exception is thrown.
>
>
>
> Is this a bug? If so, has a bug already been raised for it?
>
>
>
> Regards
>
> Ram

