hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-8764) Some MasterMonitorCallable should retry
Date Fri, 26 Jul 2013 22:39:49 GMT

     [ https://issues.apache.org/jira/browse/HBASE-8764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

stack updated HBASE-8764:
-------------------------

      Resolution: Fixed
    Release Note: 
Add retrying of Master operations; helps when running hbase-it and chaos monkey kills
Master or Master is not yet up ready to take on operations.

Refactors ServerCallable.  ServerCallable had a public call() method and then beside
it a withRetries() and also a withoutRetries().  Confusing.  Also the rpc retrying
with its specific handling of server exception returns was not reusable buried down
in ServerCallable guts.

This patch moves the rpc retrying code out of ServerCallable into a utility
RpcRetryingCaller class (A 'Caller' runs the 'Callable'). ServerCallable shrinks,
implements a new RetryingCallable Interface, and becomes RegionServerCallable, a class
that is just about Calling -- no rpc nor retries, a Callable class with added details
on where the Callable is to be applied (table name and row), -- etc.

This pattern is then applied to Master operations.  Master operations were not retried
previously.  The Master operation Callables are now like RegionServerCallable (though
they need to carry way less detail), implement RetryingCallable, and are passed to
RpcRetryingCaller so they are retried.  Changed some exceptions so they now implement
DoNotRetryException because not all master operations should be retried.
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Committed to trunk and 0.95.  Thanks for review E.
                
> Some MasterMonitorCallable should retry
> ---------------------------------------
>
>                 Key: HBASE-8764
>                 URL: https://issues.apache.org/jira/browse/HBASE-8764
>             Project: HBase
>          Issue Type: Bug
>          Components: IPC/RPC
>    Affects Versions: 0.95.1
>            Reporter: Elliott Clark
>            Assignee: stack
>             Fix For: 0.95.2
>
>         Attachments: 8764.txt, 8764v2.txt, 8764v3.txt, 8764v4.txt, 8796v5.txt, 8796v7.txt
>
>
> Calls in the admin that only get status should re-try.
> got a call stack like:
> {code}
> org.apache.hadoop.hbase.exceptions.PleaseHoldException: org.apache.hadoop.hbase.exceptions.PleaseHoldException:
Master is initializing
> 	at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2266)
> 	at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1610)
> 	at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1646)
> 	at org.apache.hadoop.hbase.protobuf.generated.MasterAdminProtos$MasterAdminService$2.callBlockingMethod(MasterAdminProtos.java:20930)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2122)
> 	at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1829)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
> 	at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
> 	at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
> 	at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:230)
> 	at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:2705)
> 	at org.apache.hadoop.hbase.client.HBaseAdmin.execute(HBaseAdmin.java:2674)
> 	at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsync(HBaseAdmin.java:524)
> 	at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:417)
> 	at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:349)
> 	at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator.createSchema(IntegrationTestBigLinkedList.java:437)
> 	at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator.runGenerator(IntegrationTestBigLinkedList.java:471)
> 	at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator.run(IntegrationTestBigLinkedList.java:505)
> 	at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Loop.runGenerator(IntegrationTestBigLinkedList.java:698)
> 	at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Loop.run(IntegrationTestBigLinkedList.java:748)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> 	at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedListWithChaosMonkey.testContinuousIngest(IntegrationTestBigLinkedListWithChaosMonkey.java:80)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:601)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> 	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> 	at org.junit.runners.Suite.runChild(Suite.java:127)
> 	at org.junit.runners.Suite.runChild(Suite.java:26)
> 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> 	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> 	at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
> 	at org.junit.runner.JUnitCore.run(JUnitCore.java:138)
> 	at org.junit.runner.JUnitCore.run(JUnitCore.java:117)
> 	at org.apache.hadoop.hbase.IntegrationTestsDriver.doWork(IntegrationTestsDriver.java:111)
> 	at org.apache.hadoop.hbase.util.AbstractHBaseTool.run(AbstractHBaseTool.java:108)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> 	at org.apache.hadoop.hbase.IntegrationTestsDriver.main(IntegrationTestsDriver.java:47)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message