hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8144) Limit number of attempts to assign a region
Date Tue, 26 Mar 2013 03:51:15 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13613460#comment-13613460 ]

ramkrishna.s.vasudevan commented on HBASE-8144:
-----------------------------------------------

@Jimmy
Sorry for getting to this late.
I am fine with all of the changes except this one, because I have a question about it :)
{code}
+        // we should retry since we already reset the region state,
+        // existing (re)assignment will fail anyway.
+        if (!server.isAborted()) {
+          continue;
{code}

I agree that we have changed the region state to OFFLINE.  But if setting the znode to OFFLINE
failed because the RS had already transitioned it to OPENED, then we do not actually handle this
in handledRegion(), because the region state is OFFLINE.  With this patch we may try to reassign
the region to another RS (possibly a double assignment); without this patch we may not update
the master's in-memory state.
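
To make the race concrete, here is a minimal, self-contained sketch. It is not HBase code: `Znode`, `ZkState`, `Outcome`, and `resetForReassign` are hypothetical names, and an `AtomicReference` compare-and-set stands in for the versioned znode update. It only illustrates that when forcing the znode to OFFLINE fails because the RS already moved it to OPENED, a blind retry would assign a region that is already open somewhere.

```java
import java.util.concurrent.atomic.AtomicReference;

public class OfflineRace {
    // Hypothetical subset of znode states, for illustration only.
    enum ZkState { OFFLINE, OPENING, OPENED }

    // Hypothetical outcomes the master could distinguish before retrying.
    enum Outcome { SAFE_TO_ASSIGN, ALREADY_OPENED, RECHECK }

    /** Stand-in for the region's znode; compareAndSet models a versioned update. */
    static final class Znode {
        final AtomicReference<ZkState> state;

        Znode(ZkState initial) {
            state = new AtomicReference<>(initial);
        }

        /** Succeeds only if the znode is still in the state the master last saw. */
        boolean forceOffline(ZkState expected) {
            return state.compareAndSet(expected, ZkState.OFFLINE);
        }
    }

    /**
     * Called after the master has already reset its in-memory state to OFFLINE.
     * Reports what the failed or successful znode update implies for a retry.
     */
    static Outcome resetForReassign(Znode znode, ZkState expectedByMaster) {
        if (znode.forceOffline(expectedByMaster)) {
            return Outcome.SAFE_TO_ASSIGN;
        }
        if (znode.state.get() == ZkState.OPENED) {
            // The region server won the race: retrying on another server
            // here is the double-assignment risk raised in the comment.
            return Outcome.ALREADY_OPENED;
        }
        return Outcome.RECHECK;
    }

    public static void main(String[] args) {
        Znode znode = new Znode(ZkState.OPENING);
        // The region server finishes opening before the master's update lands.
        znode.state.set(ZkState.OPENED);
        System.out.println(resetForReassign(znode, ZkState.OPENING));
    }
}
```

In this model, an unconditional `continue` corresponds to ignoring the `ALREADY_OPENED` outcome, while not retrying at all leaves the master's in-memory state at OFFLINE even though the region is open.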

One more thing that comes to my mind: now that TOM is disabled, when is this case possible?
If the case itself is not possible, then this is OK.
                
> Limit number of attempts to assign a region
> -------------------------------------------
>
>                 Key: HBASE-8144
>                 URL: https://issues.apache.org/jira/browse/HBASE-8144
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>            Reporter: Jimmy Xiang
>            Assignee: Jimmy Xiang
>            Priority: Minor
>             Fix For: 0.95.0, 0.98.0
>
>         Attachments: trunk-8144.patch, trunk-8144_v2.patch, trunk-8144_v3.patch
>
>
> When sending a region open request to a region server, we retry at most a configured number of times.  However, once the request is accepted by the region server, the region could go through this transition forever: failed_open (in ZK) => closed => opening => failed_open (in ZK), assuming no RPC/network issue.
> It would be good to break the loop, limit the number of tries, and move the region to the failed_open state (to be introduced in HBASE-8137).
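
The idea in the description can be sketched as a bounded loop. This is an illustrative model only, not the patch attached to the issue: `BoundedAssign`, `State`, `assign`, and the `tryOpen` callback are hypothetical names, and `tryOpen` stands in for "send the open request and wait for the result".

```java
import java.util.function.IntPredicate;

public class BoundedAssign {
    // States named in the issue description; the enum itself is hypothetical.
    enum State { CLOSED, OPENING, OPEN, FAILED_OPEN }

    /**
     * Tries to open a region at most maxAttempts times.
     * tryOpen receives the 1-based attempt number and returns true on success.
     */
    static State assign(int maxAttempts, IntPredicate tryOpen) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // closed => opening: one open attempt on a region server.
            if (tryOpen.test(attempt)) {
                return State.OPEN;
            }
            // failed_open (in ZK) => closed: fall through and try again.
        }
        // Attempts exhausted: park the region instead of looping forever.
        return State.FAILED_OPEN;
    }

    public static void main(String[] args) {
        // A region that never opens ends up in FAILED_OPEN after 3 tries.
        System.out.println(assign(3, attempt -> false));
        // A region that opens on the second try ends up OPEN.
        System.out.println(assign(3, attempt -> attempt == 2));
    }
}
```

The attempt limit is passed in as a parameter here; in a real deployment it would presumably come from configuration, as the description suggests for the existing request-level retries.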

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
