hbase-issues mailing list archives

From "Yong Zheng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14958) regionserver.HRegionServer: Master passed us a different hostname to use; was=n04docker2, but now=192.168.3.114
Date Thu, 10 Dec 2015 00:51:10 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049753#comment-15049753
] 

Yong Zheng commented on HBASE-14958:
------------------------------------

Thanks, Nick, for the prompt response.

I checked the prerequisites, but DNS configuration can't solve this issue.

My virtualized HBase cluster has only 4 nodes:
n03docker1(172.17.1.2)
n03docker2(172.17.1.3)

n04docker1(172.17.2.2)
n04docker2(172.17.2.3)

DNS is not configured, but I configured /etc/hosts:
# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

172.17.1.1   c3m3n03docker.gpfs.net c3m3n03docker    <== the br0 on the physical node c3m3n03
172.17.2.1   c3m3n04docker.gpfs.net c3m3n04docker    <== the br0 on the physical node c3m3n04

172.17.1.2   n03docker1.gpfs.net n03docker1
172.17.1.3   n03docker2.gpfs.net n03docker2
172.17.2.2   n04docker1.gpfs.net n04docker1
172.17.2.3   n04docker2.gpfs.net n04docker2

So name resolution works (I do see the correct names for n03docker1 and n03docker2). However,
for any region server located on the other physical machine, all network packets from that
region server are source NATed with the IP of c3m3n04 (192.168.3.114). That is, the source IP
of every packet is rewritten to 192.168.3.114 so that the packets can be delivered to the
physical node c3m3n03.
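
The effect described above can be sketched in a few lines of Python. This is only an
illustrative model, not HBase code: the HOSTS table mirrors the /etc/hosts entries above,
the SNAT table and the helpers source_seen_by() and name_seen_by_master() are hypothetical,
and falling back to the raw IP when no hosts entry exists is an assumption that matches the
master registering the server as 192.168.3.114.

```python
# Illustrative model only; not HBase code.
# Container entries copied from the /etc/hosts file above.
HOSTS = {
    "172.17.1.2": "n03docker1.gpfs.net",
    "172.17.1.3": "n03docker2.gpfs.net",
    "172.17.2.2": "n04docker1.gpfs.net",
    "172.17.2.3": "n04docker2.gpfs.net",
}

# Bridge subnet prefix -> physical address used for source NAT (from the topology).
SNAT = {
    "172.17.1.": "192.168.3.113",  # c3m3n03
    "172.17.2.": "192.168.3.114",  # c3m3n04
}

MASTER_IP = "172.17.1.2"  # n03docker1 runs the HBase master


def source_seen_by(dest_ip, src_ip):
    """Return the source address the destination observes for a packet."""
    for prefix, physical_ip in SNAT.items():
        if src_ip.startswith(prefix):
            # Traffic that stays on the same bridge is not NATed.
            return src_ip if dest_ip.startswith(prefix) else physical_ip
    return src_ip


def name_seen_by_master(src_ip):
    """Resolve the observed address; assume fallback to the raw IP if unmapped."""
    observed = source_seen_by(MASTER_IP, src_ip)
    return HOSTS.get(observed, observed)


for ip in ("172.17.1.3", "172.17.2.3"):
    print(HOSTS[ip], "is seen by the master as", name_seen_by_master(ip))
```

In this model n03docker2 keeps its own name because it shares a bridge with the master,
while n04docker2 is seen only as 192.168.3.114.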

As for the HBase master: 192.168.3.113 and 192.168.3.114 are invisible to HBase, so DNS
resolution for 192.168.3.114 inside the VM doesn't help. For example, 192.168.3.114's hostname
should be c3m3n04, not n04docker1 or n04docker2.
If we configure DNS inside the VM to map 192.168.3.114 to n04docker1 or n04docker2, this will
mess up the IP-to-hostname mapping inside the VM. Also, if we map 192.168.3.114 to n04docker1,
we can't start a second region server on the same physical node, because both region servers
will be identified by the physical node's IP address/hostname.
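
A small sketch of why that mapping cannot work for two region servers behind one NAT
address. Again this is a hypothetical model rather than HBase code: mismatched_servers() and
the tables are made up for illustration, and the comparison of the name resolved on the
master side against the region server's own name is assumed from the "was=..., but now=..."
message.

```python
# Hypothetical model of the workaround discussed above; not HBase code.
SNAT_IP = "192.168.3.114"  # physical address of c3m3n04

# Region servers on c3m3n04 and the hostname each one reports for itself.
REGION_SERVERS = {
    "172.17.2.2": "n04docker1.gpfs.net",
    "172.17.2.3": "n04docker2.gpfs.net",
}

# The workaround: map the physical address to one of the container names.
WORKAROUND_HOSTS = {SNAT_IP: "n04docker1.gpfs.net"}


def mismatched_servers(region_servers, snat_ip, hosts):
    """Return {own_name: name_resolved_by_master} for servers whose names disagree."""
    mismatches = {}
    for own_name in region_servers.values():
        # Every server behind the NAT is observed with the same source address,
        # so the master resolves the same name for all of them.
        seen_as = hosts.get(snat_ip, snat_ip)
        if seen_as != own_name:
            mismatches[own_name] = seen_as
    return mismatches


# With the workaround, one server matches and the other still disagrees.
print(mismatched_servers(REGION_SERVERS, SNAT_IP, WORKAROUND_HOSTS))
# Without any mapping, both servers are seen as the raw NAT address.
print(mismatched_servers(REGION_SERVERS, SNAT_IP, {}))
```

A single address can map to only one name, so at most one region server per physical node
can ever agree with what the master resolves.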

> regionserver.HRegionServer: Master passed us a different hostname to use; was=n04docker2, but now=192.168.3.114
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-14958
>                 URL: https://issues.apache.org/jira/browse/HBASE-14958
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.1.2
>         Environment: physical machines: redhat7.1
> docker version: 1.9.1
>            Reporter: Yong Zheng
>
> I have two physical machines: c3m3n03docker and c3m3n04docker.
> I started two docker instances per physical node. the topology is like:
> n03docker1(172.17.1.2)  -\
>                                           | br0(172.17.1.1)  +  c3m3n03
> n03docker2(172.17.1.3) -/
> n04docker1(172.17.2.2)  -\
>                                           | br0(172.17.2.1)  +  c3m3n04
> n04docker2(172.17.2.3) -/
> On the physical machines, c3m3n03 is bound to physical adapter enp11s0f0 with IP 192.168.3.113/16;
> c3m3n04 is bound to physical adapter enp11s0f0 with IP 192.168.3.114/16. These two physical
> adapters are connected to the same switch.
> Note: br0 is not bound to the physical adapter enp11s0f0 on either node, so all requests
> from 172.17.2.x are source NATed to 192.168.3.114 (c3m3n04) and forwarded to c3m3n03.
> n03docker1: hbase(1.1.2) master
> n03docker2: region server
> n04docker1: region server
> n04docker2: region server
> I first start n03docker1 and n03docker2, and that works; after that, I start n04docker2
> and it reports:
> 2015-12-09 08:01:58,259 ERROR [regionserver/n04docker2.gpfs.net/172.17.2.3:16020] regionserver.HRegionServer: Master passed us a different hostname to use; was=n04docker2.gpfs.net, but now=192.168.3.114
> on the master logs:
> 2015-12-09 08:11:12,234 INFO  [PriorityRpcServer.handler=0,queue=0,port=16000] master.ServerManager: Registering server=192.168.3.114,16020,1449666670721
> So when the HBase master receives requests from n04docker2, all of these requests
> are source NATed to 192.168.3.114 (not 172.17.2.3), and the HBase master passes 192.168.3.114
> back to 172.17.2.3 (n04docker2). Thus n04docker2 (172.17.2.3) reports the error in its logs.
> Does HBase not support running in a virtualized cluster? SNAT is widely used
> in virtualization. If the HBase master takes the remote hostname/IP (and thus gets 192.168.3.114)
> and passes it back to the region server, it will hit this issue.
> HBASE-8667 doesn't fix this issue, because that fix is already in HBase 0.98 (I'm using HBase
> 1.1.2).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
