hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8144) Limit number of attempts to assign a region
Date Fri, 22 Mar 2013 03:57:16 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609876#comment-13609876 ]

ramkrishna.s.vasudevan commented on HBASE-8144:
-----------------------------------------------

Should we synchronize on failedOpenTracker where we update and remove entries from this
ConcurrentHashMap? Overall, the patch looks very good.
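
For illustration only, here is a minimal sketch of how per-region failure counts could be
kept in a ConcurrentHashMap without an explicit synchronized block. It is not the code from
trunk-8144.patch: the class name, the method names, and the use of AtomicInteger values are
assumptions, and only the field name failedOpenTracker comes from the discussion above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names other than failedOpenTracker are illustrative.
class FailedOpenTrackerSketch {
  // Keyed by encoded region name (assumed); value counts FAILED_OPEN events.
  private final ConcurrentMap<String, AtomicInteger> failedOpenTracker =
      new ConcurrentHashMap<String, AtomicInteger>();
  private final int maxAttempts;

  FailedOpenTrackerSketch(int maxAttempts) {
    this.maxAttempts = maxAttempts;
  }

  /** Records one failed open. Returns true once the count reaches maxAttempts. */
  boolean recordFailedOpen(String encodedRegionName) {
    AtomicInteger count = failedOpenTracker.get(encodedRegionName);
    if (count == null) {
      AtomicInteger fresh = new AtomicInteger();
      // putIfAbsent is atomic: if two threads race, both end up with the same counter.
      count = failedOpenTracker.putIfAbsent(encodedRegionName, fresh);
      if (count == null) {
        count = fresh;
      }
    }
    return count.incrementAndGet() >= maxAttempts;
  }

  /** Forgets the region, for example after it opens successfully. */
  void regionOpened(String encodedRegionName) {
    failedOpenTracker.remove(encodedRegionName);
  }
}
```

In this sketch each map operation is atomic on its own, so the increment itself needs no
lock. If a remove can race with an update for the same region and the two must be seen as
one step, an explicit lock (or a single writer thread) would still be required.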

[~jxiang]
Actually, I recently saw this scenario in 0.94: Lars shared some logs with me in which the
region open was failing because the compression codec could not be found when the RS tried
to open the region.

So this change will at least stop the assignment from bouncing continuously between the
master and the RS.
HBASE-8049 is meant to do that.  After this patch, I think we can make that issue work like
this: in the case of FAILED_OPEN, can we add the exception message, or the reason why the open
failed, to the znode, so that once the retries are exhausted we can use that info to tell the
user about the problem?
Let me follow up on that JIRA.
Coming back to this JIRA:
once these retries are exhausted, how do we reassign the region again?  Just in case.
                
> Limit number of attempts to assign a region
> -------------------------------------------
>
>                 Key: HBASE-8144
>                 URL: https://issues.apache.org/jira/browse/HBASE-8144
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>            Reporter: Jimmy Xiang
>            Assignee: Jimmy Xiang
>            Priority: Minor
>             Fix For: 0.95.0, 0.98.0
>
>         Attachments: trunk-8144.patch
>
>
> In sending a region open request to a region server, we make sure we try at most some
> configured times.  However, once the request is accepted by the region server, the region
> could go through this transition forever: failed_open (in ZK) => closed => opening =>
> failed_open (in ZK), assuming no RPC/network issue.
> It will be good to break the loop and limit the number of tries and move the region to
> failed_open state (will be introduced in HBASE-8137)
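
The loop described in the issue can be pictured with a small simulation. This is a
hypothetical sketch, not HBase code: the class, the enum values, and the maxAttempts
parameter are all assumptions made for illustration. It models a region whose open always
fails on the region server and shows how an attempt limit ends the cycle by parking the
region in a terminal failed_open state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the transition cycle; not taken from the patch.
class AssignLoopSketch {
  enum State { CLOSED, OPENING, FAILED_OPEN_IN_ZK, FAILED_OPEN }

  /** Returns the states visited by a region that never opens successfully. */
  static List<State> run(int maxAttempts) {
    List<State> trace = new ArrayList<State>();
    State state = State.CLOSED;
    int attempts = 0;
    trace.add(state);
    while (state != State.FAILED_OPEN) {
      switch (state) {
        case CLOSED:
          // The master asks a region server to open the region.
          attempts++;
          state = State.OPENING;
          break;
        case OPENING:
          // The region server reports the failure through ZK.
          state = State.FAILED_OPEN_IN_ZK;
          break;
        case FAILED_OPEN_IN_ZK:
          // Without the limit this would always go back to CLOSED.
          state = attempts >= maxAttempts ? State.FAILED_OPEN : State.CLOSED;
          break;
        default:
          throw new IllegalStateException("unexpected state " + state);
      }
      trace.add(state);
    }
    return trace;
  }
}
```

With a limit of 2, the trace goes through closed, opening, and failed_open (in ZK) twice
before ending in the terminal failed_open state. Without the check in the last case, the
while loop would never exit, which is the behavior the issue sets out to fix.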

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
