hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "jiraposter@reviews.apache.org (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-2937) Facilitate Timeouts In HBase Client
Date Fri, 20 May 2011 05:16:47 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036677#comment-13036677
] 

jiraposter@reviews.apache.org commented on HBASE-2937:
------------------------------------------------------



bq.  On 2011-05-19 06:11:23, Michael Stack wrote:
bq.  > This seems like a bunch of functionality for a relatively small change.  Nice one
Karthick.  A few questions in the below.
bq.  
bq.  Karthick Sankarachary wrote:
bq.      Yes, it does seem like a big change for a relatively small feature, but an important
one nevertheless. The complexity stems from the fact the scope of the operation timeout has
to be limited to the {{ServerCallable#call}} method. 
bq.      
bq.      By way of motivation, if you run the TestFromClientSide test with the following patch
(which sets the "hbase.rpc.timeout" to 10ms), you'll see that 39 out of the 44 test cases
will fail.
bq.      
bq.           24 --- a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
bq.           25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
bq.           26 @@ -94,6 +94,8 @@ public class TestFromClientSide {
bq.           27    @BeforeClass
bq.           28    public static void setUpBeforeClass() throws Exception {
bq.           29      TEST_UTIL.startMiniCluster(3);
bq.           30 +    TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY,
10);
bq.           32    }
bq.      
bq.      On the other hand, if you run it with the default "hbase.rpc.timeout" but a "hbase.client.operation.timeout"
set to 10ms, then you should see the test pass.
bq.      
bq.           24 --- a/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
bq.           25 +++ b/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
bq.           26 @@ -94,6 +94,8 @@ public class TestFromClientSide {
bq.           27    @BeforeClass
bq.           28    public static void setUpBeforeClass() throws Exception {
bq.           29      TEST_UTIL.startMiniCluster(3);
bq.           30 +    TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_RPC_TIMEOUT_KEY,
60000);
bq.           31 +    TEST_UTIL.getConfiguration().setInt(HConstants.HBASE_CLIENT_OPERATION_TIMEOUT,
10);
bq.           32    }
bq.

Actually I was saying the opposite.  I'm surprised at how little code had to change to make
this fix.

So, I don't recall if there is good documentation in this patch on the difference between
"hbase.rpc.timeout" and "hbase.client.operation.timeout"?  If not, we need it.

Does the TestFromClientSide complete in shorter time if I set a "hbase.client.operation.timeout"
of 10ms?


bq.  On 2011-05-19 06:11:23, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 96
bq.  > <https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line96>
bq.  >
bq.  >     Will this work?  What happens if lots of concurrent threads going against lots
of different tables each with a different timeout?  Will a meta table call pick up a short
timeout that was put in place by a near-concurrent edit?
bq.  
bq.  Karthick Sankarachary wrote:
bq.      Good question. Because the {{HBaseRPC#rpcTimeout}} is defined to be {{ThreadLocal}},
it should only apply to the user thread that is performing the {{HTable}} operation. Also,
we take care to reset that thread-specific timeout (make it the default) after the {{ServerCallable#call}
is done. To be strictly correct, I now do the reset in a finally block in {{HConnectionImplementation#getRegionServerWith*Retries}}.

Ok.  If ThreadLocal we should be fine.


bq.  On 2011-05-19 06:11:23, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java, line 106
bq.  > <https://reviews.apache.org/r/755/diff/1/?file=19383#file19383line106>
bq.  >
bq.  >     Are there other exceptions you think we should rethrow?  Connection Exception?
bq.  
bq.  Karthick Sankarachary wrote:
bq.      How about we do what HBaseClient does, which is wrap the SocketTimeoutException inside
another one, along with a context-specific error message?

I was more wondering if there were exceptions we should treat like SocketTimeoutException?


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/755/#review683
-----------------------------------------------------------


On 2011-05-19 23:14:57, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/755/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-05-19 23:14:57)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Thanks to HBASE-3154, users now have the ability to specify a timeout for client-side
RPC calls. However, it doesn't go far enough in terms of how low that timeout can go. Set
the RPC timeout to too low a value and you run the risk of timing out on calls to the meta
tables, which are preconditions to calling the {{HRegionInterface}} proxy.
bq.  
bq.  Given that, I believe the motivation at work in HBASE-2937 still hold true. In this patch,
I add a operation-level timeout, configurable through "hbase.client.operation.timeout", which
will override the value specified by "hbase.rpc.timeout", if any, within the scope of the
{{ServerCallable#call}} method. In other words, the operation-level timeout does not apply
to calls to the meta tables. 
bq.  
bq.  Furthermore, the patch treats an RPC timeout as a non-fatal event, in that it will not
cause the {{HBaseClient#Connection}} instance to be closed. Last but not the least, users
will also have the ability to set the operation timeout on the {{HTable}} on the fly.
bq.  
bq.  
bq.  This addresses bug HBASE-2937.
bq.      https://issues.apache.org/jira/browse/HBASE-2937
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java e9e3694 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b26f41e 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java 61e151a 
bq.    src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 6f22123 
bq.    src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 470e741 
bq.    src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java dbb57d9 
bq.    src/main/java/org/apache/hadoop/hbase/util/PoolMap.java 354d49a 
bq.  
bq.  Diff: https://reviews.apache.org/r/755/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.



> Facilitate Timeouts In HBase Client
> -----------------------------------
>
>                 Key: HBASE-2937
>                 URL: https://issues.apache.org/jira/browse/HBASE-2937
>             Project: HBase
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 0.89.20100621
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: HBASE-2937.patch, HBASE-2937.patch
>
>
> Currently, there is no way to force an operation on the HBase client (viz. HTable) to
time out if a certain amount of time has elapsed.  In other words, all invocations on the
HTable class are veritable blocking calls, which will not return until a response (successful
or otherwise) is received. 
> In general, there are two ways to handle timeouts:  (a) call the operation in a separate
thread, until it returns a response or the wait on the thread times out and (b) have the underlying
socket unblock the operation if the read times out.  The downside of the former approach is
that it consumes more resources in terms of threads and callables. 
> Here, we describe a way to specify and handle timeouts on the HTable client, which relies
on the latter approach (i.e., socket timeouts). Right now, the HBaseClient sets the socket
timeout to the value of the "ipc.ping.interval" parameter, which is also how long it waits
before pinging the server in case of a failure. The goal is to allow clients to set that timeout
on the fly through HTable. Rather than adding an optional timeout argument to every HTable
operation, we chose to make it a property of HTable which effectively applies to every method
that involves a remote operation.
> In order to propagate the timeout  from HTable to HBaseClient, we replaced all occurrences
of ServerCallable in HTable with an extension called ClientCallable, which sets the timeout
on the region server interface, once it has been instantiated, through the HConnection object.
The latter, in turn, asks HBaseRPC to pass that timeout to the corresponding Invoker, so that
it may inject the timeout at the time the invocation is made on the region server proxy. Right
before the request is sent to the server, we set the timeout specified by the client on the
underlying socket.
> In conclusion, this patch will afford clients the option of performing an HBase operation
until it completes or a specified timeout elapses. Note that a timeout of zero is interpreted
as an infinite timeout.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message