Date: Wed, 3 Jul 2013 06:24:22 +0000 (UTC)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-8667) Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.


    [ https://issues.apache.org/jira/browse/HBASE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698665#comment-13698665 ]

stack commented on HBASE-8667:
------------------------------

Patch looks great. Let me try it on a cluster w/ broken reverse DNS to make sure we don't regress, but I like that this patch already appears to have removed the special-casing of the Ubuntu install.

Good on you, Rajesh.

> Master and Regionserver not able to communicate if both bound to different network interfaces on the same machine.
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-8667
>                 URL: https://issues.apache.org/jira/browse/HBASE-8667
>             Project: HBase
>          Issue Type: Bug
>          Components: IPC/RPC
>            Reporter: rajeshbabu
>            Assignee: rajeshbabu
>             Fix For: 0.98.0, 0.95.2, 0.94.10
>
>         Attachments: HBASE-8667_trunk.patch, HBASE-8667_Trunk.patch, HBASE-8667_Trunk-V2.patch, HBASE-8667_trunk_v4.patch, HBASE-8667_trunk_v5.patch, HBASE-8667_trunk_v6.patch
>
>
> While testing the HBASE-8640 fix, I found that a master and regionserver running on different interfaces do not communicate properly.
> I have two interfaces, 1) lo and 2) eth0, on my machine, and the default hostname interface is lo.
> I have configured the master ipc address to the IP of the eth0 interface.
> Started master and regionserver on the same machine.
> 1) The master rpc server is bound to eth0 and the RS rpc server is bound to lo.
> 2) Since the rpc client does not bind to any IP address, when the RS reports its startup it gets registered with the eth0 IP address (but it should actually register as localhost).
> Here are RS logs:
> {code}
> 2013-05-31 06:05:28,608 WARN  [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
> 2013-05-31 06:05:31,609 INFO  [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at 192.168.0.100,60000,1369960497008
> 2013-05-31 06:05:31,609 INFO  [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 192.168.0.100,60000,1369960497008 that we are up with port=60020, startcode=1369960502544
> 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://localhost:2851/hbase
> 2013-05-31 06:05:31,618 DEBUG [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://localhost:2851
> 2013-05-31 06:05:31,618 INFO  [regionserver60020] org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=localhost, but now=192.168.0.100
> {code}
> Here are master logs:
> {code}
> 2013-05-31 06:05:31,615 INFO  [IPC Server handler 9 on 60000] org.apache.hadoop.hbase.master.ServerManager: Registering server=192.168.0.100,60020,1369960502544
> {code}
> Since master has wrong rpc server address of RS, META is not getting assigned.
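The registration mismatch described above (an unbound rpc client socket, so the master records whatever source address the OS picked) can be illustrated with plain java.net sockets. This is only a minimal sketch and not code from any HBASE-8667 patch: the class name BoundClientSketch and the method observedSource are hypothetical, and a throwaway loopback listener stands in for the master.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration only; not taken from the HBASE-8667 patch.
public class BoundClientSketch {

    /**
     * Connects a client socket to a temporary loopback listener and returns the
     * source address that the listener observed. If bindAddr is non-null, the
     * client is bound to it before connecting; otherwise the OS chooses the
     * source address based on its routing decision.
     */
    public static InetAddress observedSource(InetAddress bindAddr) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket()) {
            if (bindAddr != null) {
                // Port 0 lets the OS choose an ephemeral local port.
                client.bind(new InetSocketAddress(bindAddr, 0));
            }
            client.connect(new InetSocketAddress(loopback, server.getLocalPort()), 5000);
            try (Socket accepted = server.accept()) {
                // The address a server would record for the caller.
                return accepted.getInetAddress();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InetAddress lo = InetAddress.getLoopbackAddress();
        System.out.println("bound client seen as:   " + observedSource(lo));
        System.out.println("unbound client seen as: " + observedSource(null));
    }
}
```

Because the stand-in listener here is on loopback, both calls will normally report a loopback address. The mismatch in the report needs a listener on a non-loopback interface (eth0 in the description), where an unbound client would be seen with that interface's address instead of localhost.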
> {code}
> 2013-05-31 06:05:34,362 DEBUG [master-192.168.0.100,60000,1369960497008] org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=192.168.0.100,60020,1369960502544; 1 (online=1, available=1) available servers, forceNewPlan=false
> -----
> org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of .META.,,1.1028785192 to 192.168.0.100,60020,1369960502544, trying to assign elsewhere instead; try=1 of 10
> java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
>     at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:549)
>     at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:813)
>     at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1422)
>     at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1315)
>     at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1532)
>     at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1587)
>     at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:15039)
>     at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:627)
>     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1826)
>     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1453)
>     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1432)
>     at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104)
>     at org.apache.hadoop.hbase.master.AssignmentManager.addToRITandCallClose(AssignmentManager.java:699)
>     at org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:584)
>     at org.apache.hadoop.hbase.master.AssignmentManager.processRegionInTransition(AssignmentManager.java:517)
>     at org.apache.hadoop.hbase.master.AssignmentManager.processRegionInTransitionAndBlockUntilAssigned(AssignmentManager.java:473)
>     at org.apache.hadoop.hbase.master.HMaster.assignMeta(HMaster.java:917)
>     at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:803)
>     at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:547)
>     at java.lang.Thread.run(Thread.java:636)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira