hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12554) TestBaseLoadBalancer may timeout due to lengthy rack lookup
Date Fri, 21 Nov 2014 22:48:33 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221528#comment-14221528 ]

stack commented on HBASE-12554:
-------------------------------

+1

On commit, put public first in the declaration below, and add a class comment explaining why
we are putting this mock class in place:

static public class...

Leave out the other changes (those to RackManager and to BaseLoadBalancer), since you don't
know whether they have any effect. Logging that we spent 60 seconds in lookup could be good,
but you can't be sure the interrupt works. Do such changes in another JIRA, where you can try
the code against bad DNS to see that it is doing the right thing.
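
For illustration only, here is a minimal sketch of what the requested modifier order and class
comment might look like. The names RackMapping, MockMapping, TestBalancerSketch and the
"/default-rack" value are hypothetical and are not taken from the attached patches; RackMapping
merely stands in for whatever mapping interface the test actually mocks.

```java
import java.util.ArrayList;
import java.util.List;

class TestBalancerSketch {

  /** Hypothetical stand-in for the rack-mapping interface used by the balancer. */
  interface RackMapping {
    List<String> resolve(List<String> names);
  }

  /**
   * Mock mapping for tests. It answers every lookup from memory so that the
   * test never performs a real DNS query, which can block long enough to
   * trip the test timeout.
   */
  public static class MockMapping implements RackMapping {
    @Override
    public List<String> resolve(List<String> names) {
      // Return one fixed (assumed) rack name per requested host.
      List<String> racks = new ArrayList<>(names.size());
      for (int i = 0; i < names.size(); i++) {
        racks.add("/default-rack");
      }
      return racks;
    }
  }
}
```

Note "public static" rather than "static public", which is the conventional Java modifier order.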




> TestBaseLoadBalancer may timeout due to lengthy rack lookup
> -----------------------------------------------------------
>
>                 Key: HBASE-12554
>                 URL: https://issues.apache.org/jira/browse/HBASE-12554
>             Project: HBase
>          Issue Type: Test
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>         Attachments: 12554-v1.txt, 12554-v2.txt, 12554-v3.txt, 12554-v4.txt
>
>
> Here is one of the recent occurrences (https://builds.apache.org/job/PreCommit-HBASE-Build/11778/console):
> {code}
> testImmediateAssignment(org.apache.hadoop.hbase.master.balancer.TestBaseLoadBalancer)  Time elapsed: 30.019 sec  <<< ERROR!
> java.lang.Exception: test timed out after 30000 milliseconds
> 	at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
> 	at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
> 	at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
> 	at java.net.InetAddress.getAllByName0(InetAddress.java:1246)
> 	at java.net.InetAddress.getAllByName(InetAddress.java:1162)
> 	at java.net.InetAddress.getAllByName(InetAddress.java:1098)
> 	at java.net.InetAddress.getByName(InetAddress.java:1048)
> 	at org.apache.hadoop.net.NetUtils.normalizeHostName(NetUtils.java:561)
> 	at org.apache.hadoop.net.NetUtils.normalizeHostNames(NetUtils.java:578)
> 	at org.apache.hadoop.net.CachedDNSToSwitchMapping.resolve(CachedDNSToSwitchMapping.java:109)
> 	at org.apache.hadoop.hbase.master.RackManager.getRack(RackManager.java:66)
> 	at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.<init>(BaseLoadBalancer.java:273)
> 	at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.createCluster(BaseLoadBalancer.java:1113)
> 	at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.randomAssignment(BaseLoadBalancer.java:1175)
> 	at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.immediateAssignment(BaseLoadBalancer.java:1145)
> 	at org.apache.hadoop.hbase.master.balancer.TestBaseLoadBalancer.testImmediateAssignment(TestBaseLoadBalancer.java:136)
> {code}
> One possible fix is to submit CachedDNSToSwitchMapping.resolve() to an executor pool for
> execution. RackManager.getRack() can then set a timeout beyond which UNKNOWN_RACK is returned.
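
The fix proposed in the description could be sketched roughly as follows. This is a
self-contained illustration, not the code from any of the attached patches: the class name
RackLookupSketch, the method getRackWithTimeout, the Function-based resolver, and the value of
UNKNOWN_RACK are all assumptions made for the example. As the review comment cautions, cancelling
the future only interrupts the worker thread; a thread blocked in a native DNS call may not
respond to the interrupt, so the lookup itself can keep running in the background.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

class RackLookupSketch {
  // Assumed placeholder value; not copied from the HBase source.
  static final String UNKNOWN_RACK = "Unknown Rack";

  // Daemon threads, so a lookup stuck in DNS cannot keep the JVM alive.
  private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
    Thread t = new Thread(r, "rack-lookup");
    t.setDaemon(true);
    return t;
  });

  /**
   * Runs the (possibly slow) resolver on the pool and waits at most
   * timeoutMs for the answer; falls back to UNKNOWN_RACK otherwise.
   */
  static String getRackWithTimeout(Function<String, String> resolver,
                                   String host, long timeoutMs) {
    Callable<String> task = () -> resolver.apply(host);
    Future<String> future = POOL.submit(task);
    try {
      String rack = future.get(timeoutMs, TimeUnit.MILLISECONDS);
      return rack != null ? rack : UNKNOWN_RACK;
    } catch (TimeoutException e) {
      // Best effort only: the interrupt may be ignored by a native lookup.
      future.cancel(true);
      return UNKNOWN_RACK;
    } catch (ExecutionException e) {
      // The resolver itself failed; treat the rack as unknown.
      return UNKNOWN_RACK;
    } catch (InterruptedException e) {
      // Restore the caller's interrupt status and give up on the lookup.
      Thread.currentThread().interrupt();
      future.cancel(true);
      return UNKNOWN_RACK;
    }
  }
}
```

The caller is bounded by the timeout either way, but verifying that the worker thread actually
stops would require testing against a slow or broken DNS setup.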



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
