hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
Date Thu, 20 Jun 2013 05:12:21 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688877#comment-13688877 ]

stack commented on HBASE-8667:

You remove this:

-      NameStringPair.Builder entry = NameStringPair.newBuilder()
-        .setName(HConstants.KEY_FOR_HOSTNAME_SEEN_BY_MASTER)
-        .setValue(rs.getHostname());

The implication is that the master and regionserver will never disagree on the RS name?  Is that
so?  The master just takes the name the RS proffers?  I do not see any resolve going on in here
(no InetAddress construction), so it may be that there is no DNS in here to mess us up.
This should be ok:

+      this.serverNameFromMasterPOV = new ServerName(this.isa.getHostName(), this.isa.getPort(),
+          this.startcode);

This is over on the RS.  And it is telling the master what name to use, the one it found when
it did a resolve.
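
As a rough sketch of what that amounts to: the RS builds its name from the socket address it holds plus its startcode. The `host,port,startcode` layout below is inferred from the log lines quoted in the issue; the class, the method, and the hostname are hypothetical stand-ins, not HBase's ServerName:

```java
import java.net.InetSocketAddress;

// Hypothetical stand-in for illustration; not HBase's ServerName class.
class ServerNameSketch {
    // Formats a name as host,port,startcode, the layout seen in the quoted logs.
    static String serverName(InetSocketAddress isa, long startcode) {
        return isa.getHostName() + "," + isa.getPort() + "," + startcode;
    }

    public static void main(String[] args) {
        // createUnresolved keeps the example independent of local DNS.
        InetSocketAddress isa = InetSocketAddress.createUnresolved("rs1.example.com", 60020);
        System.out.println(serverName(isa, 1369960502544L));
        // prints rs1.example.com,60020,1369960502544
    }
}
```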

We should change the name of this variable then: serverNameFromMasterPOV

This is a change in how we do server naming, but it looks safe and solves a few issues we have
had for a while.

Anyone else want to take a look here?

> Master and Regionserver not able to communicate if both bound to different network interfaces
on the same machine.
> ------------------------------------------------------------------------------------------------------------------
>                 Key: HBASE-8667
>                 URL: https://issues.apache.org/jira/browse/HBASE-8667
>             Project: HBase
>          Issue Type: Bug
>          Components: IPC/RPC
>            Reporter: rajeshbabu
>             Fix For: 0.98.0, 0.95.2, 0.94.9
>         Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch
> While testing the HBASE-8640 fix, I found that a master and regionserver running on different
interfaces are not communicating properly.
> I have two interfaces 1) lo 2) eth0 in my machine and default hostname interface is lo.
> I have configured master ipc address to ip of eth0 interface.
> Started master and regionserver on the same machine.
> 1) master rpc server bound to eth0 and RS rpc server bound to lo
> 2) Since the rpc client is not binding to any ip address, when the RS reports its startup
it gets registered with the eth0 ip address (but it should actually register as localhost)
> Here are RS logs:
> {code}
> 2013-05-31 06:05:28,608 WARN  [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer:
reportForDuty failed; sleeping and then retrying.
> 2013-05-31 06:05:31,609 INFO  [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer:
Attempting connect to Master server at,60000,1369960497008
> 2013-05-31 06:05:31,609 INFO  [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer:
Telling master at,60000,1369960497008 that we are up with port=60020, startcode=1369960502544
> 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer:
Config from master: hbase.rootdir=hdfs://localhost:2851/hbase
> 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer:
Config from master: fs.default.name=hdfs://localhost:2851
> 2013-05-31 06:05:31,618 INFO  [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer:
Master passed us a different hostname to use; was=localhost, but now=
> {code}
> Here are master logs:
> {code}
> 2013-05-31 06:05:31,615 INFO  [IPC Server handler 9 on 60000] org.apache.hadoop.hbase.master.ServerManager:
Registering server=,60020,1369960502544
> {code}
> Since master has wrong rpc server address of RS, META is not getting assigned.
> {code}
> 2013-05-31 06:05:34,362 DEBUG [master-,60000,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager:
No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192
so generated a random one; hri=.META.,,1.1028785192, src=, dest=,60020,1369960502544;
1 (online=1, available=1) available servers, forceNewPlan=false
> -----
> org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192
to,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
> 	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> 	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
> 	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
> 	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549)
> 	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813)
> 	at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422)
> 	at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315)
> 	at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532)
> 	at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1587)
> 	at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:15039)
> 	at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:627)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1826)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1453)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1432)
> 	at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.addToRITandCallClose(AssignmentManager.java:699)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:584)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.processRegionInTransition(AssignmentManager.java:517)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.processRegionInTransitionAndBlockUntilAssigned(AssignmentManager.java:473)
> 	at org.apache.hadoop.hbase.master.HMaster.assignMeta(HMaster.java:917)
> 	at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:803)
> 	at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:547)
> 	at java.lang.Thread.run(Thread.java:636)
> {code}
