hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5973) Add ability for potentially long-running IPC calls to abort if client disconnects
Date Thu, 10 May 2012 22:22:50 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272845#comment-13272845 ]

stack commented on HBASE-5973:
------------------------------

Might Todd.  When we call this:

{code}
     private boolean nextInternal(int limit, String metric) throws IOException {
+      RpcCallContext rpcCall = HBaseServer.getCurrentCall();
       while (true) {
+        if (rpcCall != null) {
+          // If a user specifies a too-restrictive or too-slow scanner, the
+          // client might time out and disconnect while the server side
+          // is still processing the request. We should abort aggressively
+          // in that case.
+          rpcCall.throwExceptionIfCallerDisconnected();
+        }
{code}

... if the connection is closed when we check, the exception does not come out here and
abort this current nextInternal invocation?  Rather, it comes out on the stuck handler?
Does it interrupt the ongoing call, the one without a client?

Pardon dumb question.  Just trying to understand how this fix works.
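
For reference, a minimal standalone sketch of the polling pattern shown in the diff above. It is not HBase code: `CallContext`, `FakeCall`, and `rowsProcessedBeforeAbort` are hypothetical names invented for illustration, and the sketch assumes the check is a plain method call made by the thread that runs the loop. Under that assumption, the exception would surface inside the loop that polls, on the same thread.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

public class DisconnectCheckSketch {

    /** Hypothetical stand-in for the RpcCallContext seen in the diff. */
    interface CallContext {
        void throwExceptionIfCallerDisconnected() throws IOException;
    }

    /** Fake call whose "connection" can be flipped off to simulate a client timeout. */
    static class FakeCall implements CallContext {
        final AtomicBoolean connected = new AtomicBoolean(true);

        @Override
        public void throwExceptionIfCallerDisconnected() throws IOException {
            if (!connected.get()) {
                throw new IOException("caller disconnected");
            }
        }
    }

    /**
     * Mimics a long-running scan loop that polls the call context once per
     * iteration. The fake client "disconnects" after disconnectAfter rows.
     * Returns how many rows were processed before the loop stopped.
     */
    static int rowsProcessedBeforeAbort(int rowsWanted, int disconnectAfter) {
        FakeCall call = new FakeCall();
        int rows = 0;
        try {
            while (rows < rowsWanted) {
                // Same shape as the check at the top of the loop in the diff.
                call.throwExceptionIfCallerDisconnected();
                rows++; // stand-in for the real per-row work
                if (rows == disconnectAfter) {
                    call.connected.set(false); // simulate the client going away
                }
            }
        } catch (IOException e) {
            // In this sketch the exception is raised on the thread running the loop.
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(rowsProcessedBeforeAbort(100, 3));
        System.out.println(rowsProcessedBeforeAbort(5, 10));
    }
}
```

In this sketch the loop stops at the first check made after the simulated disconnect; no other thread is interrupted. Whether the real patch behaves the same way depends on how HBaseServer.getCurrentCall() associates a call with a handler, which the sketch does not model.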
                
> Add ability for potentially long-running IPC calls to abort if client disconnects
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-5973
>                 URL: https://issues.apache.org/jira/browse/HBASE-5973
>             Project: HBase
>          Issue Type: Improvement
>          Components: ipc
>    Affects Versions: 0.90.7, 0.92.1, 0.94.0, 0.96.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: hbase-5973-0.92.txt, hbase-5973-0.94.txt, hbase-5973-0.94.txt, hbase-5973.txt, hbase-5973.txt, hbase-5973.txt
>
>
> We recently had a cluster issue where a user was submitting scanners with a very restrictive
> filter, and then calling next() with a high scanner caching value. The clients would generally
> time out the next() call and disconnect, but the IPC kept running looking to fill the requested
> number of rows. Since this was in the context of MR, the tasks making the calls would retry,
> and the retries would be more likely to time out due to contention with the previous still-running
> scanner next() call. Eventually, the system spiraled out of control.
> We should add a hook to the IPC system so that RPC calls can check if the client has
> already disconnected. In such a case, the next() call could abort processing, given any further
> work is wasted. I imagine coprocessor endpoints, etc, could make good use of this as well.


        
