hbase-issues mailing list archives

From "Heng Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14182) My regionserver change ip. But hmaster still connect to old ip after the rs restart
Date Tue, 04 Aug 2015 14:47:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653748#comment-14653748 ]

Heng Chen commented on HBASE-14182:
-----------------------------------

There seems to be a better solution. As the JDK docs say:
{quote}
InetAddress Caching
The InetAddress class has a cache to store successful as well as unsuccessful host name resolutions.
By default, when a security manager is installed, in order to protect against DNS spoofing
attacks, the result of positive host name resolutions are cached forever. When a security
manager is not installed, the default behavior is to cache entries for a finite (implementation
dependent) period of time. The result of unsuccessful host name resolution is cached for a
very short period of time (10 seconds) to improve performance.

If the default behavior is not desired, then a Java security property can be set to a different
Time-to-live (TTL) value for positive caching. Likewise, a system admin can configure a different
negative caching TTL value when needed.

Two Java security properties control the TTL values used for positive and negative host name
resolution caching:

networkaddress.cache.ttl
Indicates the caching policy for successful name lookups from the name service. The value
is specified as an integer to indicate the number of seconds to cache the successful lookup.
The default setting is to cache for an implementation specific period of time.
A value of -1 indicates "cache forever".

networkaddress.cache.negative.ttl (default: 10)
Indicates the caching policy for un-successful name lookups from the name service. The value
is specified as an integer to indicate the number of seconds to cache the failure for un-successful
lookups.
A value of 0 indicates "never cache". A value of -1 indicates "cache forever".
{quote}

We can set networkaddress.cache.ttl to a finite value, so that a cached address expires and the hostname is resolved again.
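
As a rough illustration of that suggestion, the property can be set programmatically through java.security.Security.setProperty. This is only a sketch under assumptions: the class name DnsCacheTtl, the helper limitDnsCache, and the 60-second and 10-second values are hypothetical and do not come from HBase. It also has not been verified that this alone fixes the behaviour reported in this issue. The property should be set early in JVM startup, before name lookups begin, since the caching policy may already be fixed by then.

```java
import java.security.Security;

public class DnsCacheTtl {

    // Hypothetical helper: sets JVM-wide TTLs (in seconds) for the InetAddress cache.
    // A negative positive-TTL is rejected because -1 means "cache forever",
    // which is exactly the behaviour we want to avoid here.
    static void limitDnsCache(int positiveTtlSeconds, int negativeTtlSeconds) {
        if (positiveTtlSeconds < 0) {
            throw new IllegalArgumentException("positive TTL must be >= 0");
        }
        Security.setProperty("networkaddress.cache.ttl",
                Integer.toString(positiveTtlSeconds));
        Security.setProperty("networkaddress.cache.negative.ttl",
                Integer.toString(negativeTtlSeconds));
    }

    public static void main(String[] args) {
        // Assumed values: cache successful lookups for 60 s, failures for 10 s.
        limitDnsCache(60, 10);
        System.out.println("networkaddress.cache.ttl="
                + Security.getProperty("networkaddress.cache.ttl"));
        System.out.println("networkaddress.cache.negative.ttl="
                + Security.getProperty("networkaddress.cache.negative.ttl"));
    }
}
```

The same setting can instead be placed in the JVM's java.security file as a line such as networkaddress.cache.ttl=60, which avoids a code change.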

> My regionserver change ip. But hmaster still connect to old ip after the rs restart
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-14182
>                 URL: https://issues.apache.org/jira/browse/HBASE-14182
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.98.6
>            Reporter: Heng Chen
>
> I use Docker to deploy my HBase cluster, and the IP of a regionserver (RS) changed. After
> I restarted this RS, the HMaster web UI showed it as connected to the HMaster, but its region
> count was still zero after a long time. I checked the HMaster log and found that the master
> was still using the old IP to connect to this RS.
> The HMaster log is below.
> PS: 10.11.21.140 is the old IP of the RS dx-ape-regionserver1-online.
> {code}
> 2015-08-04 17:24:00,081 INFO  [AM.ZK.Worker-pool2-t14141] master.AssignmentManager: Assigning solar_image,\x01Y\x8E\xA3y,1434968237206.4a1bdeec85b9f55b962596f9fb2cd07f. to dx-ape-regionserver1-online,60020,1438679950072
> 2015-08-04 17:24:06,800 WARN  [AM.ZK.Worker-pool2-t14133] master.AssignmentManager: Failed assignment of solar_image,\x00\x94\x09\x8D\x95,1430991781025.b0f5b755f443d41cf306026a60675020. to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=3 of 10
> java.net.ConnectException: Connection timed out
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
>         at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:578)
>         at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:868)
>         at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1543)
>         at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)
>         at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
>         at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
>         at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964)
>         at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:671)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2097)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1577)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1550)
>         at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104)
>         at org.apache.hadoop.hbase.master.AssignmentManager.handleRegion(AssignmentManager.java:999)
>         at org.apache.hadoop.hbase.master.AssignmentManager$6.run(AssignmentManager.java:1447)
>         at org.apache.hadoop.hbase.master.AssignmentManager$3.run(AssignmentManager.java:1260)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> 2015-08-04 17:24:06,801 WARN  [AM.ZK.Worker-pool2-t14140] master.AssignmentManager: Failed assignment of solar_image,\x00(.\xE7\xB1L,1430024620929.534025fcf4cae5516513b9c9a4cf73dc. to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=2 of 10
> java.net.ConnectException: Call to dx-ape-regionserver1-online/10.11.21.140:60020 failed on connection exception: java.net.ConnectException: Connection timed out
>         at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1483)
>         at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1461)
>         at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
>         at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
>         at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964)
>         at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:671)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2097)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1577)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1550)
>         at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104)
>         at org.apache.hadoop.hbase.master.AssignmentManager.handleRegion(AssignmentManager.java:999)
>         at org.apache.hadoop.hbase.master.AssignmentManager$6.run(AssignmentManager.java:1447)
>         at org.apache.hadoop.hbase.master.AssignmentManager$3.run(AssignmentManager.java:1260)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.ConnectException: Connection timed out
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
>         at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:578)
>         at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:868)
>         at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1543)
>         at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)
>         ... 16 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
