hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HBASE-5067) HMaster uses wrong name for address (in stand-alone mode)
Date Sat, 11 Apr 2015 01:02:12 GMT

     [ https://issues.apache.org/jira/browse/HBASE-5067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-5067.
-----------------------------------
    Resolution: Not A Problem

> HMaster uses wrong name for address (in stand-alone mode)
> ---------------------------------------------------------
>
>                 Key: HBASE-5067
>                 URL: https://issues.apache.org/jira/browse/HBASE-5067
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Eran Hirsch
>
> In STANDALONE mode:
> When the configuration option "hbase.master.dns.interface" (and optionally
> "hbase.master.dns.nameserver") is set to a non-default value,
> it is EXPECTED that the master node reports its fully qualified DNS name when
> registering in ZooKeeper,
> BUT INSTEAD the machine's hostname is used.
> For example, my machine is called (that is, its hostname is) "machine1", but its name
> on the network is "machine1.our-dev-network.my-corp.com". To find this machine's IP from
> anywhere on the network, I would need to query for the whole name (because "machine1"
> alone is ambiguous on a network).
> Why is this a bug? When a client connects to this stand-alone HBase installation
> from outside the machine it is running on and queries ZK for /hbase/master, it gets only
> the "machine1" part and then fails with an unresolvable address for the master (which later
> even fails on a null hostname because of a missing null check).
> This is the stack trace from calling the HTable constructor:
> java.lang.IllegalArgumentException: hostname can't be null
> 	at java.net.InetSocketAddress.<init>(InetSocketAddress.java:139) ~[na:1.7.0_02]
> 	at org.apache.hadoop.hbase.HServerAddress.getResolvedAddress(HServerAddress.java:108) ~[hbase-0.90.4.jar:0.90.4]
> 	at org.apache.hadoop.hbase.HServerAddress.<init>(HServerAddress.java:64) ~[hbase-0.90.4.jar:0.90.4]
> 	at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.dataToHServerAddress(RootRegionTracker.java:82) ~[hbase-0.90.4.jar:0.90.4]
> 	at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.waitRootRegionLocation(RootRegionTracker.java:73) ~[hbase-0.90.4.jar:0.90.4]
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:579) ~[hbase-0.90.4.jar:0.90.4]
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:559) ~[hbase-0.90.4.jar:0.90.4]
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:688) ~[hbase-0.90.4.jar:0.90.4]
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:590) ~[hbase-0.90.4.jar:0.90.4]
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:559) ~[hbase-0.90.4.jar:0.90.4]
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:688) ~[hbase-0.90.4.jar:0.90.4]
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:594) ~[hbase-0.90.4.jar:0.90.4]
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:559) ~[hbase-0.90.4.jar:0.90.4]
> 	at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:173) ~[hbase-0.90.4.jar:0.90.4]
> 	at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:147) ~[hbase-0.90.4.jar:0.90.4]
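
The top frame of this trace can be reproduced in isolation. The following is a minimal sketch, not HBase code: the class name, the helper method, and the port value are made up for illustration. It only shows that the JDK's `InetSocketAddress(String, int)` constructor rejects a null host name with an `IllegalArgumentException`, which is the exception type reported above.

```java
import java.net.InetSocketAddress;

public class NullHostnameDemo {
    /**
     * Hypothetical helper: returns the message of the exception raised when
     * a socket address is built from a null host name.
     */
    static String messageForNullHost(int port) {
        try {
            // Stand-in for a host name that could not be determined.
            String hostname = null;
            new InetSocketAddress(hostname, port);
            return "no exception";
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // 60000 is an arbitrary example port.
        System.out.println(messageForNullHost(60000));
    }
}
```

A null check before this constructor call would let the caller report the unresolvable name directly instead of failing deep inside address construction.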
> ==============
> Why does this happen?
> 1. When building the HMaster object, we correctly use the static 'getMyAddress(conf)'
> to read the configuration options and then resolve the machine's IP. This method
> returns the fully qualified name correctly, which is then used to construct an
> 'HServerAddress' object stored locally as 'a'.
> 2. So far so good. But now, instead of using this object as the value of the master's
> 'address' field, the code goes on to initialize the 'rpcServer' field. As part of this, the
> static 'HBaseRPC.getServer' method is called with, among other arguments, the BIND ADDRESS
> (that is, the IP) of the HServerAddress we have just built.
> 3. When we finally get to setting the value of HMaster's 'address' field, we
> create a NEW HServerAddress initialized with rpcServer.getListenerAddress() (which is
> basically the IP we just gave it, with a new listening port).
> 4. HServerAddress calls 'getAddress().getHostName()' on this address object, which
> returns the local hostname of the machine, because the IP is resolved locally by the
> machine and not through a nameserver.
> So in the end, the fully qualified name computed in step 1 is NOT USED in any way. Instead,
> all further processing is done on the IP address of the host (and its local resolution to
> the hostname).
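
The loss of the name described in steps 1 to 4 can be modelled with plain JDK classes. This is only a sketch under stated assumptions: it does not use HBase's 'HServerAddress', the class and method names are hypothetical, and the IP (192.0.2.10, a documentation address) and port are invented for illustration.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

public class LostHostnameDemo {
    // Hypothetical values for illustration only.
    static final String FQDN = "machine1.our-dev-network.my-corp.com";
    static final byte[] IP = {(byte) 192, 0, 2, 10};

    /** Step 1: an address that carries the configured fully qualified name. */
    static InetSocketAddress configuredAddress(int port) {
        try {
            // getByAddress(host, bytes) stores the name as given and performs no lookup.
            return new InetSocketAddress(InetAddress.getByAddress(FQDN, IP), port);
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Steps 2-3: rebuild an address from the raw IP only, as a listener might report it. */
    static InetSocketAddress rebuiltFromIp(InetSocketAddress configured, int listenerPort) {
        try {
            byte[] raw = configured.getAddress().getAddress();
            return new InetSocketAddress(InetAddress.getByAddress(raw), listenerPort);
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        InetSocketAddress configured = configuredAddress(60000);
        InetSocketAddress rebuilt = rebuiltFromIp(configured, 60000);
        // getHostString() never triggers a reverse lookup, so it shows what each object holds.
        System.out.println(configured.getHostString());
        System.out.println(rebuilt.getHostString());
        // Calling rebuilt.getHostName() instead would ask the local resolver for a name
        // (step 4); the answer depends on the machine's resolver configuration.
    }
}
```

The rebuilt address holds only the IP, so any name obtained from it later comes from a reverse lookup rather than from the configured value.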
> =======
> What should happen?
> The 'HMaster.address' field should be set to an address made of the fully qualified
> name retrieved in step 1, combined with the port obtained from the rpcServer created in
> step 2.
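
The proposed behaviour could be sketched as follows, again with plain JDK classes rather than HBase's own types. The helper 'advertisedAddress' and the example values are hypothetical, and this is not the change that was or would be made in HMaster; it only illustrates combining the configured name with the listener's port.

```java
import java.net.InetSocketAddress;

public class AdvertisedAddressDemo {
    /**
     * Hypothetical helper: keeps the configured name and takes only the port
     * from the address the listener actually bound to.
     */
    static InetSocketAddress advertisedAddress(String configuredName, InetSocketAddress listener) {
        // createUnresolved stores the name verbatim and performs no DNS lookup.
        return InetSocketAddress.createUnresolved(configuredName, listener.getPort());
    }

    public static void main(String[] args) {
        // Example listener address: an IP literal (no lookup needed) and an arbitrary port.
        InetSocketAddress listener = new InetSocketAddress("192.0.2.10", 60000);
        InetSocketAddress advertised =
                advertisedAddress("machine1.our-dev-network.my-corp.com", listener);
        System.out.println(advertised.getHostString() + ":" + advertised.getPort());
    }
}
```

Keeping the name as an unresolved string leaves resolution to whoever reads the advertised address, which is what a remote client needs in the scenario described above.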
> ====
> Notes:
> 1. It seems that the 'HBaseServer' constructor (which is called when the static
> 'HBaseRPC.getServer()' method is called) is faulty: in effect it does not use the port number
> passed to it. It sets the local 'port' field to that number, but then, without ever reading
> it, overwrites it with the port returned from the new 'Listener' object. This might be a
> bug, but I have not checked it thoroughly.
> 2. The same bug could also exist in the region server code, but
> I haven't checked that at all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
