hbase-issues mailing list archives

From "Prakash Khemani (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3828) region server stuck in waitOnAllRegionsToClose
Date Thu, 05 May 2011 17:36:03 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029452#comment-13029452 ]

Prakash Khemani commented on HBASE-3828:

In my cluster this turned out to be a problem in code that we had modified internally. In
the region server abort code path we had added a check: if the filesystem is unavailable,
do not try to close the regions. But the main thread went ahead anyway and waited for the
regions to close, which caused the hang in waitOnAllRegionsToClose(). (Aside: there is
an internal task on this. When an append to the HLog fails, HBase relies on the DFSClient to
close the filesystem in order for the regionserver abort to be triggered. That is very
roundabout, and there ought to be a more direct, synchronous abort facility.)
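
The hang described above can be modelled with a small sketch. This is not the
actual HRegionServer code: the class AbortHangSketch, its fields, and every method
other than the name waitOnAllRegionsToClose are hypothetical, and the timeout exists
only so that the sketch terminates instead of hanging. It assumes the wait loop simply
polls until the set of online regions is empty.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the hang; not the real region server implementation.
class AbortHangSketch {
    private final Map<String, Object> onlineRegions = new ConcurrentHashMap<>();
    private volatile boolean fsAvailable = true;

    void addRegion(String name) { onlineRegions.put(name, new Object()); }

    void setFsAvailable(boolean available) { fsAvailable = available; }

    // Models the internal modification: skip closing regions when the
    // filesystem is unavailable, so the regions stay in onlineRegions.
    void closeRegionsOnAbort() {
        if (!fsAvailable) {
            return;
        }
        onlineRegions.clear();
    }

    // Polls until no regions are online. Returns false on timeout; without
    // the timeout (added for this sketch) the loop would never exit.
    boolean waitOnAllRegionsToClose(long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!onlineRegions.isEmpty()) {
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    // One possible fix (an assumption, not the patch that was applied):
    // only wait for regions when a close was actually attempted.
    // Returns true if the abort sequence completed without timing out.
    boolean abort(long timeoutMillis) {
        closeRegionsOnAbort();
        if (!fsAvailable) {
            return true; // nothing was asked to close, so do not wait
        }
        return waitOnAllRegionsToClose(timeoutMillis);
    }

    public static void main(String[] args) {
        AbortHangSketch rs = new AbortHangSketch();
        rs.addRegion("region-1");
        rs.setFsAvailable(false);

        rs.closeRegionsOnAbort(); // skipped: filesystem is down
        System.out.println("buggy wait finished: " + rs.waitOnAllRegionsToClose(100));
        System.out.println("guarded abort finished: " + rs.abort(100));
    }
}
```

In the sketch the unguarded wait can never succeed once the close is skipped, because
nothing else removes the region from the map. That mismatch between the two code paths
is the one described above.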


It is possible that no further synchronization is necessary when a region is being opened,
but I haven't looked at the code closely enough. What happens between the time when the zk
node is closed and the time the region is actually closed on the rs? When is the region
removed from onlineRegions? Is it possible that one thread adds it and another immediately
removes it? I will try to spend some time on this soon.


> region server stuck in waitOnAllRegionsToClose
> ----------------------------------------------
>                 Key: HBASE-3828
>                 URL: https://issues.apache.org/jira/browse/HBASE-3828
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Prakash Khemani

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
