hbase-dev mailing list archives

From "huaxiang sun (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-17889) ResultBoundedCompletionService's cancel() needs to interrupt the working thread and free it to the thread-pool
Date Thu, 06 Apr 2017 22:33:42 GMT
huaxiang sun created HBASE-17889:
------------------------------------

             Summary: ResultBoundedCompletionService's cancel() needs to interrupt the working thread and free it to the thread-pool
                 Key: HBASE-17889
                 URL: https://issues.apache.org/jira/browse/HBASE-17889
             Project: HBase
          Issue Type: Bug
          Components: Client
    Affects Versions: 2.0.0, 1.4.0, 1.2.6, 1.3.2
            Reporter: huaxiang sun
            Assignee: huaxiang sun


We ran into a case with read replicas: when the server hosting the primary region was shut
down, the Get did not go to the replica region, and it paused for about 50 seconds before
resuming.

Further debugging showed that when the server went down, one of the threads was stuck in a
socket write while holding the lock at
https://github.com/apache/hbase/blob/branch-1.3/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClientImpl.java#L916.
Later write threads waited on this lock until every thread in the connection's thread pool
was blocked on it. At that point no work could be done. Once the socket write timed out, all
the threads were freed and processing continued.

When QueueingFuture#cancel() is called, it neither interrupts the working thread nor returns
the thread to the pool.
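
To illustrate the difference this makes, here is a minimal sketch using only
java.util.concurrent, not the HBase classes. The class CancelInterruptSketch and its
method poolThreadFreedAfterCancel are hypothetical names made up for this example, and
the latch wait stands in for a blocking call that responds to interrupts. A thread
blocked on a monitor does not respond to interrupts, so this shows only the cancel()
semantics and is not a reproduction of the stuck write.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration; not part of HBase.
public class CancelInterruptSketch {

    /**
     * Blocks the only pool thread with one task, cancels that task, then reports
     * whether a second task gets to run within waitMillis.
     */
    static boolean poolThreadFreedAfterCancel(boolean mayInterrupt, long waitMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            Future<?> stuck = pool.submit(() -> {
                started.countDown();
                try {
                    // Assumption: stands in for an interruptible blocking call.
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            started.await(); // make sure the worker thread is occupied

            // cancel(false) only marks the future as cancelled;
            // cancel(true) also interrupts the thread running the task.
            stuck.cancel(mayInterrupt);

            Future<?> next = pool.submit(() -> { });
            try {
                next.get(waitMillis, TimeUnit.MILLISECONDS);
                return true;  // the worker thread was returned to the pool
            } catch (TimeoutException e) {
                return false; // the worker thread is still blocked
            } catch (ExecutionException e) {
                throw new IllegalStateException(e);
            }
        } finally {
            release.countDown(); // unblock the first task if it is still waiting
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("cancel(false) frees thread: "
                + poolThreadFreedAfterCancel(false, 500));
        System.out.println("cancel(true) frees thread: "
                + poolThreadFreedAfterCancel(true, 500));
    }
}
```

With cancel(false) the second task stays queued behind the blocked worker until the
timeout expires; with cancel(true) the worker is interrupted, finishes the first task,
and picks up the second one.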

Attaching the jstack trace.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)