hbase-issues mailing list archives

From "Victor Xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14359) HTable#close will hang forever if unchecked error/exception thrown in AsyncProcess#sendMultiAction
Date Thu, 03 Sep 2015 12:03:45 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14728915#comment-14728915
] 

Victor Xu commented on HBASE-14359:
-----------------------------------

Adding the jstack output for this issue:
{noformat}
"Thread-14" daemon prio=10 tid=0x00007fc9f7c0d800 nid=0x125b in Object.wait() [0x0000000043acc000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at org.apache.hadoop.hbase.client.AsyncProcess.waitForNextTaskDone(AsyncProcess.java:988)
        - locked <0x0000000788126080> (a java.util.concurrent.atomic.AtomicLong)
        at org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:1014)
        at org.apache.hadoop.hbase.client.AsyncProcess.waitUntilDone(AsyncProcess.java:1027)
        at org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:1092)
        - locked <0x0000000788126168> (a java.lang.Object)
        at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1424)
        at org.apache.hadoop.hbase.client.HTable.close(HTable.java:1461)
        at com.alibaba.search.offline.sync.common.hbase.HBaseTable.returnHTable(HBaseTable.java:136)
        at com.alibaba.search.offline.sync.common.hbase.HBaseTable.batchPut(HBaseTable.java:185)
        at com.alibaba.search.offline.sync.common.hbase.HBaseTable.batchPut(HBaseTable.java:159)
        at com.alibaba.search.offline.sync.sync.storage.HBaseStorageHandler.multiPut(HBaseStorageHandler.java:80)
        at com.alibaba.search.offline.sync.sync.LoaderProcessor.doEveryTable(LoaderProcessor.java:130)
        at com.alibaba.search.offline.sync.sync.LoaderProcessor.execute(LoaderProcessor.java:50)
        at com.alibaba.search.offline.sync.sync.ProcessorRunner.perform(ProcessorRunner.java:58)
        at com.alibaba.search.offline.sync.sync.DaemonWorker.daemonWork(DaemonWorker.java:51)
        - locked <0x00000007809c4e48> (a com.alibaba.search.offline.sync.sync.ProcessorRunner)
        at com.alibaba.search.offline.sync.sync.DaemonWorker.run(DaemonWorker.java:31)
{noformat}
All user processes hang in the while loop of AsyncProcess#waitForNextTaskDone.
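
For readers who do not have the source open, the shape of that loop can be sketched roughly as follows. This is a simplified illustration, not the actual AsyncProcess code: the class name WaitLoopSketch, the two counter fields, and the maxWaitMs bound are all assumptions made so that the example terminates. The stack trace above only shows a timed wait on an AtomicLong monitor, which is what the sketch imitates.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not the HBase source: a waiter polls a "done" counter
// until it catches up with a "sent" counter.
public class WaitLoopSketch {
    private final AtomicLong tasksSent = new AtomicLong();
    private final AtomicLong tasksDone = new AtomicLong();

    public void taskSent() {
        tasksSent.incrementAndGet();
    }

    public void taskDone() {
        tasksDone.incrementAndGet();
        synchronized (tasksDone) {
            tasksDone.notifyAll(); // wake any thread blocked in waitUntilDone
        }
    }

    // Returns true once every sent task is done, false if maxWaitMs elapses
    // first (or the thread is interrupted). The overall bound is an addition
    // for this demo; without it the loop would spin on timed waits forever
    // when the counters can never match.
    public boolean waitUntilDone(long maxWaitMs) {
        long deadline = System.currentTimeMillis() + maxWaitMs;
        while (tasksDone.get() < tasksSent.get()) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            synchronized (tasksDone) {
                if (tasksDone.get() < tasksSent.get()) {
                    try {
                        tasksDone.wait(Math.min(remaining, 100));
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        WaitLoopSketch s = new WaitLoopSketch();
        s.taskSent(); // counted as sent, but taskDone() is never called
        System.out.println("finished = " + s.waitUntilDone(300));
    }
}
```

Run as written, the demo reports that the wait did not finish, because nothing ever increments the done counter. Each individual wait times out, which matches the TIMED_WAITING state in the dump, but the enclosing loop condition never becomes false.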

> HTable#close will hang forever if unchecked error/exception thrown in AsyncProcess#sendMultiAction
> --------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-14359
>                 URL: https://issues.apache.org/jira/browse/HBASE-14359
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.14, 1.1.2
>            Reporter: Yu Li
>            Assignee: Victor Xu
>         Attachments: HBASE-14359-0.98-v1.patch, HBASE-14359-master-branch1-v1.patch
>
>
> Currently, AsyncProcess#sendMultiAction catches only RejectedExecutionException and lets
> any other error or exception propagate, in which case decTaskCounter is never invoked. Meanwhile,
> the recommended way to use HTable is to close the table in a finally clause, and HTable#close
> calls flushCommits and waits until all tasks are done.
> The problem is that when an unchecked error or exception such as OutOfMemoryError is thrown, taskSent
> will never equal taskDone, so AsyncProcess#waitUntilDone never returns. In particular,
> if autoflush is set, so that there is no data to flush during table close, no RPC call is made,
> rpcTimeOut cannot interrupt the wait, and the thread waits there forever.
> In our production environment, the unchecked error we observed was "java.lang.OutOfMemoryError: unable
> to create new native thread", and we saw the client thread hang for hours.
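
The counter imbalance described in the quoted report can be illustrated with a small standalone sketch. It is not the attached patch and does not use HBase classes: SubmitSketch, submitRejectionOnly, submitBalanced, and pending are hypothetical names, and the OutOfMemoryError is thrown by a fake executor purely to simulate the failure. The second method shows one possible way to keep the counters balanced (a try/finally with a hand-off flag); the actual fix may well differ.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the sent/done bookkeeping; not HBase code.
public class SubmitSketch {
    private final AtomicLong tasksSent = new AtomicLong();
    private final AtomicLong tasksDone = new AtomicLong();

    // Tasks a waiter would still be blocked on.
    public long pending() {
        return tasksSent.get() - tasksDone.get();
    }

    // Wraps the work so the done counter is bumped when the task finishes.
    private Runnable wrap(Runnable work) {
        return () -> {
            try {
                work.run();
            } finally {
                tasksDone.incrementAndGet();
            }
        };
    }

    // Pattern from the report: only RejectedExecutionException is handled,
    // so any other Throwable from execute() leaves the counters unbalanced.
    public void submitRejectionOnly(Executor pool, Runnable work) {
        tasksSent.incrementAndGet();
        try {
            pool.execute(wrap(work));
        } catch (RejectedExecutionException e) {
            tasksDone.incrementAndGet();
        }
    }

    // One possible remedy (an assumption, not the attached patch): balance
    // the counter for any failure to hand off, then let the error propagate.
    public void submitBalanced(Executor pool, Runnable work) {
        tasksSent.incrementAndGet();
        boolean handedOff = false;
        try {
            pool.execute(wrap(work));
            handedOff = true;
        } finally {
            if (!handedOff) {
                tasksDone.incrementAndGet();
            }
        }
    }

    public static void main(String[] args) {
        // Fake executor that simulates the error seen in production.
        Executor failing = r -> {
            throw new OutOfMemoryError("simulated: unable to create new native thread");
        };

        SubmitSketch a = new SubmitSketch();
        try {
            a.submitRejectionOnly(failing, () -> { });
        } catch (OutOfMemoryError expected) {
            // the caller would normally reach its finally block and close the table
        }
        System.out.println("rejection-only pending = " + a.pending());

        SubmitSketch b = new SubmitSketch();
        try {
            b.submitBalanced(failing, () -> { });
        } catch (OutOfMemoryError expected) {
            // same error, but the counters are balanced before it propagates
        }
        System.out.println("balanced pending = " + b.pending());
    }
}
```

In the first case one task remains pending forever, which is the condition that keeps a close-time wait from returning. In the second case nothing is pending, so a subsequent wait would return immediately even though the error still reaches the caller.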



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
