hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12072) Standardize retry handling for master operations
Date Thu, 05 Nov 2015 19:27:28 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14992306#comment-14992306 ]

Enis Soztutar commented on HBASE-12072:
---------------------------------------

[~appy], can you please not add fixVersions=1.0.0 to jiras that already have a 0.99.x fix version?

The fixVersions field tracks the first release in which the change appeared. Since we did 0.99.x
releases before 1.0.0, some of the jiras that you recently modified had already shipped in earlier
0.99.x releases.

Let me know whether this helps: 
http://markmail.org/message/u43qluenc7soxloe

> Standardize retry handling for master operations
> ------------------------------------------------
>
>                 Key: HBASE-12072
>                 URL: https://issues.apache.org/jira/browse/HBASE-12072
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.6
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 1.0.0, 2.0.0, 0.99.2
>
>         Attachments: 12072-v1.txt, 12072-v2.txt, hbase-12072_v1.patch, hbase-12072_v2.patch,
hbase-12072_v2.patch, hbase-12072_v3.patch
>
>
> For master requests, there are two retry mechanisms in effect. The first one is from HBaseAdmin.executeCallable():
> {code}
>   private <V> V executeCallable(MasterCallable<V> callable) throws IOException {
>     RpcRetryingCaller<V> caller = rpcCallerFactory.newCaller();
>     try {
>       return caller.callWithRetries(callable);
>     } finally {
>       callable.close();
>     }
>   }
> {code}
> The second one runs inside the first, in StubMaker.makeStub():
> {code}
> /**
>        * Create a stub against the master.  Retry if necessary.
>        * @return A stub to do <code>intf</code> against the master
>        * @throws MasterNotRunningException
>        */
>       @edu.umd.cs.findbugs.annotations.SuppressWarnings (value="SWL_SLEEP_WITH_LOCK_HELD")
>       Object makeStub() throws MasterNotRunningException {
> {code}
> Because the two loops are nested, the tests just hang for about 10 min * 35 ~= 6 hours.
> {code}
> 2014-09-23 16:19:05,151 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 1 of 35 failed; retrying after sleep of 100, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:05,253 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 2 of 35 failed; retrying after sleep of 200, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:05,456 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 3 of 35 failed; retrying after sleep of 300, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:05,759 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 4 of 35 failed; retrying after sleep of 500, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:06,262 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 5 of 35 failed; retrying after sleep of 1008, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:07,273 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 6 of 35 failed; retrying after sleep of 2011, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:09,286 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 7 of 35 failed; retrying after sleep of 4012, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:13,303 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 8 of 35 failed; retrying after sleep of 10033, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:23,343 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 9 of 35 failed; retrying after sleep of 10089, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:33,439 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 10 of 35 failed; retrying after sleep of 10027, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:43,473 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 11 of 35 failed; retrying after sleep of 10004, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:53,485 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 12 of 35 failed; retrying after sleep of 20160, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:20:13,656 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 13 of 35 failed; retrying after sleep of 20006, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:20:33,675 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 14 of 35 failed; retrying after sleep of 20076, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:20:53,762 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 15 of 35 failed; retrying after sleep of 20077, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:21:13,852 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 16 of 35 failed; retrying after sleep of 20103, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:21:33,967 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 17 of 35 failed; retrying after sleep of 20136, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:21:54,115 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 18 of 35 failed; retrying after sleep of 20147, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:22:14,274 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 19 of 35 failed; retrying after sleep of 20131, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:22:34,417 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 20 of 35 failed; retrying after sleep of 20171, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:22:54,601 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 21 of 35 failed; retrying after sleep of 20177, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:23:14,790 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 22 of 35 failed; retrying after sleep of 20193, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:23:34,996 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 23 of 35 failed; retrying after sleep of 20195, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:23:55,203 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 24 of 35 failed; retrying after sleep of 20107, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:24:15,322 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 25 of 35 failed; retrying after sleep of 20186, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:24:35,520 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 26 of 35 failed; retrying after sleep of 20106, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:24:55,638 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 27 of 35 failed; retrying after sleep of 20173, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:25:15,824 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 28 of 35 failed; retrying after sleep of 20136, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:25:35,973 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 29 of 35 failed; retrying after sleep of 20188, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:25:56,174 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 30 of 35 failed; retrying after sleep of 20144, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:26:16,330 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 31 of 35 failed; retrying after sleep of 20106, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:26:36,448 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 32 of 35 failed; retrying after sleep of 20003, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:26:56,463 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 33 of 35 failed; retrying after sleep of 20114, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:16,590 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 34 of 35 failed; retrying after sleep of 20154, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:36,756 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 35 of 35 failed; no more retrying.
> java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 	at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:114)
> 	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1554)
> 	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1599)
> 	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1653)
> 	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1860)
> 	at org.apache.hadoop.hbase.client.HBaseAdmin$MasterCallable.prepare(HBaseAdmin.java:3359)
> 	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:122)
> 	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:92)
> 	at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3386)
> 	at org.apache.hadoop.hbase.client.HBaseAdmin.getClusterStatus(HBaseAdmin.java:2201)
> 	at org.apache.hadoop.hbase.DistributedHBaseCluster.getClusterStatus(DistributedHBaseCluster.java:74)
> 	at org.apache.hadoop.hbase.DistributedHBaseCluster.<init>(DistributedHBaseCluster.java:57)
> 	at org.apache.hadoop.hbase.IntegrationTestingUtility.createDistributedHBaseCluster(IntegrationTestingUtility.java:140)
> 	at org.apache.hadoop.hbase.IntegrationTestingUtility.initializeCluster(IntegrationTestingUtility.java:75)
> 	at org.apache.hadoop.hbase.IntegrationTestManyRegions.setUp(IntegrationTestManyRegions.java:80)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> 	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> 	at org.junit.runners.Suite.runChild(Suite.java:127)
> 	at org.junit.runners.Suite.runChild(Suite.java:26)
> 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> 	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> 	at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
> 	at org.junit.runner.JUnitCore.run(JUnitCore.java:138)
> 	at org.junit.runner.JUnitCore.run(JUnitCore.java:117)
> 	at org.apache.hadoop.hbase.IntegrationTestsDriver.doWork(IntegrationTestsDriver.java:110)
> 	at org.apache.hadoop.hbase.util.AbstractHBaseTool.run(AbstractHBaseTool.java:112)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> 	at org.apache.hadoop.hbase.IntegrationTestsDriver.main(IntegrationTestsDriver.java:46)
> 2014-09-23 16:27:37,061 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 1 of 35 failed; retrying after sleep of 100, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:37,163 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 2 of 35 failed; retrying after sleep of 200, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:37,365 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 3 of 35 failed; retrying after sleep of 301, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:37,669 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 4 of 35 failed; retrying after sleep of 504, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:38,176 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 5 of 35 failed; retrying after sleep of 1008, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:39,185 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 6 of 35 failed; retrying after sleep of 2018, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:41,207 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 7 of 35 failed; retrying after sleep of 4019, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:45,231 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 8 of 35 failed; retrying after sleep of 10004, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:55,241 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 9 of 35 failed; retrying after sleep of 10005, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:28:05,253 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 10 of 35 failed; retrying after sleep of 10099, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:28:15,359 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 11 of 35 failed; retrying after sleep of 10059, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:28:25,425 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 12 of 35 failed; retrying after sleep of 20069, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:28:45,507 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 13 of 35 failed; retrying after sleep of 20006, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:29:05,525 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 14 of 35 failed; retrying after sleep of 20186, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:29:25,723 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 15 of 35 failed; retrying after sleep of 20080, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:29:45,814 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 16 of 35 failed; retrying after sleep of 20001, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:30:05,826 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 17 of 35 failed; retrying after sleep of 20019, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:30:25,857 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 18 of 35 failed; retrying after sleep of 20159, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:30:46,028 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 19 of 35 failed; retrying after sleep of 20170, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:31:06,211 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 20 of 35 failed; retrying after sleep of 20146, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:31:26,368 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 21 of 35 failed; retrying after sleep of 20138, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:31:46,518 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 22 of 35 failed; retrying after sleep of 20140, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:32:06,670 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 23 of 35 failed; retrying after sleep of 20196, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:32:26,878 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 24 of 35 failed; retrying after sleep of 20123, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:32:47,013 INFO  [main] client.ConnectionManager$HConnectionImplementation:
getMaster attempt 25 of 35 failed; retrying after sleep of 20033, exception=java.io.IOException:
Can't get master address from ZooKeeper; znode data == null
> {code}
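
The "10 min * 35" estimate in the description can be checked with a rough sketch like the one below. It is not HBase code: the class and method names are hypothetical, the backoff multipliers are read off the sleep values in the log above (assuming a 100 ms base pause), and jitter and any sleep done by the outer loop itself are ignored. Under those assumptions one inner getMaster loop sleeps for roughly 8.5 minutes, and 35 outer attempts add up to about 5 hours, which is the same order of magnitude as the estimate in the description.

```java
import java.time.Duration;

// Hypothetical helper, not part of HBase: estimates how long two nested
// retry loops spend sleeping when every attempt fails.
class NestedRetryEstimate {
    // Backoff multipliers inferred from the log above (sleeps of roughly
    // 100, 200, 300, 500, 1000, 2000, 4000, 10000 x4, 20000 ... ms).
    // This table is an assumption made for the sketch.
    static final int[] BACKOFF = {1, 2, 3, 5, 10, 20, 40, 100, 100, 100, 100, 200, 200};

    // Total sleep of one inner loop. A sleep follows every failed attempt
    // except the last one, so there are (attempts - 1) sleeps.
    static long innerSleepMillis(int attempts, long pauseMillis) {
        long total = 0;
        for (int i = 0; i < attempts - 1; i++) {
            // Past the end of the table, keep using the last multiplier.
            int idx = Math.min(i, BACKOFF.length - 1);
            total += pauseMillis * BACKOFF[idx];
        }
        return total;
    }

    // Sleep time when the whole inner loop is repeated by an outer loop.
    // Ignores jitter and any pause taken by the outer loop itself.
    static long nestedSleepMillis(int outerAttempts, int innerAttempts, long pauseMillis) {
        return outerAttempts * innerSleepMillis(innerAttempts, pauseMillis);
    }

    public static void main(String[] args) {
        long inner = innerSleepMillis(35, 100);
        long nested = nestedSleepMillis(35, 35, 100);
        System.out.println("one inner loop:    " + Duration.ofMillis(inner));
        System.out.println("35 outer attempts: " + Duration.ofMillis(nested));
    }
}
```

The point of the sketch is only that the two retry counts multiply: bounding either loop on its own still leaves the total wait proportional to the product of both.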



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
