hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3431) Regionserver is not using the name given it by the master; double entry in master listing of servers
Date Sat, 05 Feb 2011 00:37:30 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990852#comment-12990852 ]

stack commented on HBASE-3431:
------------------------------

If the master can't resolve the regionserver's address, then the master does this:

{code}
Caused by: java.lang.IllegalArgumentException: Could not resolve the DNS name of sv2borg185:60020
    at org.apache.hadoop.hbase.HServerAddress.checkBindAddressCanBeResolved(HServerAddress.java:105)
    at org.apache.hadoop.hbase.HServerAddress.readFields(HServerAddress.java:168)
    at org.apache.hadoop.hbase.HServerInfo.readFields(HServerInfo.java:230)
    at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:521)
    ... 8 more
{code}

... which is kinda dumb, but it means no progress unless the server's name resolves to an address.
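
A minimal sketch of that kind of check, for illustration only. This is not the HBase source: the class and method names (ResolveCheckSketch, checkResolved) are made up, and it assumes the check amounts to asking java.net.InetSocketAddress whether the name resolved.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

// Hypothetical illustration; not the actual HServerAddress code.
final class ResolveCheckSketch {

    // Refuse to go on if the socket address has no resolved IP behind it.
    static void checkResolved(InetSocketAddress addr) {
        if (addr.isUnresolved()) {
            throw new IllegalArgumentException(
                "Could not resolve the DNS name of "
                    + addr.getHostName() + ":" + addr.getPort());
        }
    }

    public static void main(String[] args) {
        // createUnresolved() skips the DNS lookup, so it stands in for a
        // name the master cannot resolve.
        InetSocketAddress bad =
            InetSocketAddress.createUnresolved("sv2borg185", 60020);
        try {
            checkResolved(bad);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }

        // An address built from an InetAddress is already resolved and passes.
        checkResolved(
            new InetSocketAddress(InetAddress.getLoopbackAddress(), 60020));
    }
}
```

The point is only that an unresolvable name stops things at deserialization time; a name that resolves to the wrong address gets past a check like this.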

If DNS is wrong, e.g. on the master, so that a lookup on the passed name comes up w/ a different
address, then we tell the regionserver to go forward with the IP.

At the moment you'll see two entries for this badly configured server: the regionserver
shows up by its name and by its bad IP.
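
A rough sketch of why two entries can appear, assuming the master keys its server listing by a "host,port,startcode" string like the ones in the ServerManager log lines quoted below. The class DoubleEntrySketch and its methods are hypothetical and only model the bookkeeping, not the real ServerManager.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of a server listing keyed by "host,port,startcode".
final class DoubleEntrySketch {

    private final Map<String, Integer> onlineServers = new LinkedHashMap<>();

    static String serverName(String host, int port, long startcode) {
        return host + "," + port + "," + startcode;
    }

    // Registering under a key that already exists just overwrites it.
    void register(String host, int port, long startcode, int regionCount) {
        onlineServers.put(serverName(host, port, startcode), regionCount);
    }

    int size() {
        return onlineServers.size();
    }

    public static void main(String[] args) {
        DoubleEntrySketch master = new DoubleEntrySketch();
        // Same process (same port and startcode), two different host strings.
        master.register("10.10.30.11", 60020, 1294443935414L, 0);
        master.register("perfnode11", 60020, 1294443935414L, 0);
        System.out.println(master.size());
    }
}
```

Since the host string is part of the key, the same regionserver process registered once by IP and once by name counts as two servers, and the one that never reports again is the ghost.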

Symptom is you can't shut down because the master is waiting on the ghost server to finish its
close up (this is what was happening for mr oracle.com).

I manufactured Ted's problem by changing hosts on the master to have a different subnet for a server.
Then I got this in the RS log:

{code}
2011-02-05 00:33:49,409 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us address to use. Was=sv2borg185:60020, Now=10.20.20.185:60020
{code}

Let me dig in.



> Regionserver is not using the name given it by the master; double entry in master listing of servers
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-3431
>                 URL: https://issues.apache.org/jira/browse/HBASE-3431
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.0
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.90.1
>
>         Attachments: 3431.txt
>
>
> Our man Ted Dunning found the following: the RS checks in with one name, the master tells it to use another name, but we seem to go ahead and continue with our original name.
> In RS logs I see:
> {code}
> 2011-01-07 15:45:50,757 INFO  org.apache.hadoop.hbase.regionserver.HRegionServer [regionserver60020]: Master passed us address to use. Was=perfnode11:60020, Now=10.10.30.11:60020
> {code}
> On master I see
> {code}
> 2011-01-07 15:45:38,613 INFO  org.apache.hadoop.hbase.master.ServerManager [IPC Server handler 0 on 60000]: Registering server=10.10.30.11,60020,1294443935414, regionCount=0, userLoad=false
> {code}
> ....
> then later
> {code}
> 2011-01-07 15:45:44,247 INFO  org.apache.hadoop.hbase.master.ServerManager [IPC Server handler 2 on 60000]: Registering server=perfnode11,60020,1294443935414, regionCount=0, userLoad=true
> {code}
> This might be because we started letting servers register by means other than reportStartup.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
