hbase-issues mailing list archives

From "Qiang Tian (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11902) RegionServer was blocked while aborting
Date Thu, 27 Nov 2014 06:50:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227322#comment-14227322 ]

Qiang Tian commented on HBASE-11902:

OK. The latest failure happens because, in the test case, only the WAL write fails. If we hide
the exception (just decrement the counter) and continue, the data flush succeeds, so the
completeCacheFlush call decrements the counter again.
To preserve the counter semantics, the simplest approach is best: return right away. (the original
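
To illustrate the double decrement described above, here is a minimal sketch. It is not the
HBase code and not the attached patch: the class, the AtomicInteger counter, and every method
except the name completeCacheFlush are hypothetical stand-ins, assumed only for illustration.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of this is taken from the HBase source or the patch.
class FlushCounterSketch {
    // Stand-in for the counter of in-flight flush operations.
    static final AtomicInteger inFlight = new AtomicInteger();

    static void startCacheFlush()    { inFlight.incrementAndGet(); }
    static void completeCacheFlush() { inFlight.decrementAndGet(); }
    static void abortCacheFlush()    { inFlight.decrementAndGet(); }

    // Simulates the test case: only the WAL write fails.
    static void walWrite(boolean fail) throws IOException {
        if (fail) {
            throw new IOException("simulated WAL write failure");
        }
    }

    // Variant 1: hide the exception (just decrement) and continue.
    static int flushHidingFailure(boolean walFails) {
        inFlight.set(0);
        startCacheFlush();
        try {
            walWrite(walFails);
        } catch (IOException e) {
            abortCacheFlush();      // first decrement
            // execution falls through to the data flush
        }
        // The data flush itself succeeds, so completion runs as usual.
        completeCacheFlush();       // second decrement when the WAL write failed
        return inFlight.get();
    }

    // Variant 2: return right away after the failure.
    static int flushReturningEarly(boolean walFails) {
        inFlight.set(0);
        startCacheFlush();
        try {
            walWrite(walFails);
        } catch (IOException e) {
            abortCacheFlush();      // the only decrement on this path
            return inFlight.get();
        }
        completeCacheFlush();
        return inFlight.get();
    }

    public static void main(String[] args) {
        System.out.println(flushHidingFailure(true));   // prints -1
        System.out.println(flushReturningEarly(true));  // prints 0
    }
}
```

In this sketch the first variant leaves the counter at -1 after a failed WAL write, while the
early return keeps each increment paired with exactly one decrement.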

> RegionServer was blocked while aborting
> ---------------------------------------
>                 Key: HBASE-11902
>                 URL: https://issues.apache.org/jira/browse/HBASE-11902
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver, wal
>    Affects Versions: 0.98.4
>         Environment: hbase-0.98.4, hadoop-2.3.0-cdh5.1, jdk1.7
>            Reporter: Victor Xu
>            Assignee: Qiang Tian
>         Attachments: hbase-hadoop-regionserver-hadoop461.cm6.log, hbase11902-master.patch, hbase11902-master_v2.patch, jstack_hadoop461.cm6.log
> Generally, the regionserver aborts automatically when isHealth() returns false, but it sometimes
> gets blocked while aborting. I saved the jstack and logs and found that the blockage was caused
> by datanode failures: the "regionserver60020" thread was blocked while closing the WAL.
> This issue doesn't happen often, but when it does, it always leads to a huge number
> of failed requests. The only way out is kill -9.
> I think it's a bug, but I haven't found a decent solution. Does anyone have the same

This message was sent by Atlassian JIRA
