hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5237) Addendum for HBASE-5160 and HBASE-4397
Date Mon, 23 Jan 2012 04:39:39 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190869#comment-13190869
] 

ramkrishna.s.vasudevan commented on HBASE-5237:
-----------------------------------------------

@Stack
Sorry if you feel the check-in is not correct.  Please find below the analysis and the
scenario that explain why this fix is needed.
As per HBASE-4397, if no RS is alive after we prepare a region plan and send the RPC to open
the region, we get an exception. While handling it we try to get a new region plan, and if
that is null we set
{code}
this.timeoutMonitor.setAllRegionServersOffline(true);
{code}

This patch does the same thing. The difference is that the check happens even before an RPC
is sent: a null region plan at that point means no servers are alive.
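
To illustrate the idea, here is a minimal, self-contained sketch. It is not the HBase code: the class, the `pickServer` helper and the use of plain strings for servers are hypothetical stand-ins for `getRegionPlan` and the TimeoutMonitor flag.

```java
import java.util.List;

// Hypothetical sketch, not the AssignmentManager itself.
final class AssignSketch {
    // Stands in for the flag set via timeoutMonitor.setAllRegionServersOffline(true).
    boolean allRegionServersOffline = false;

    // Hypothetical stand-in for getRegionPlan: null means no server is available.
    static String pickServer(List<String> liveServers) {
        return liveServers.isEmpty() ? null : liveServers.get(0);
    }

    // Returns the chosen server, or null after recording that no server was alive.
    String assign(List<String> liveServers) {
        String plan = pickServer(liveServers);
        if (plan == null) {
            // Record the condition before any RPC is attempted, so the
            // monitor can react once a server comes back.
            allRegionServersOffline = true;
            return null;
        }
        return plan;
    }
}
```

The only point of the sketch is that the flag is set on the null-plan path instead of returning silently.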

bq. And then in this case we set a flag up in TM. But TM only runs every 30 minutes so the setting
of this flag doesn't do much?
As per the patch in HBASE-4397, we added another check for the case where the TM timeout has
not yet elapsed, which helps in assigning regions earlier.
{code}
if (regionState.getStamp() + timeout <= now) {
  actOnTimeOut(unassigns, assigns, regionState, regionInfo);
} else if (this.allRegionServersOffline && !allRSsOffline) {
  actOnTimeOut(unassigns, assigns, regionState, regionInfo);
}
{code}
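
The two conditions in the snippet above can be isolated as a small predicate. This is only a sketch: the class and method names are hypothetical, and the parameters simply mirror the variables in the snippet (I am assuming `allRegionServersOffline` is the flag set earlier and `allRSsOffline` is the state seen on the current TM run).

```java
// Hypothetical helper, not HBase code.
final class TimeoutCheckSketch {
    static boolean shouldActOnTimeOut(long stamp, long timeout, long now,
                                      boolean allRegionServersOffline,
                                      boolean allRSsOffline) {
        // Normal case: the region has been in transition longer than the timeout.
        if (stamp + timeout <= now) {
            return true;
        }
        // Early case: all servers were flagged offline before, but at least
        // one is online now, so there is no need to wait for the timeout.
        return allRegionServersOffline && !allRSsOffline;
    }

    public static void main(String[] args) {
        // Timeout not elapsed, flag set earlier, a server is back.
        System.out.println(shouldActOnTimeOut(0L, 10L, 5L, true, false));
    }
}
```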

Even when the offline-servers flag was set in the middle of an assign, the current code simply
returned, which made the assign wait for the TM.
That is now avoided, and the assign will be tried again if any RS comes alive soon after
this.
We hit this problem in our cluster, and I uploaded the patch after verifying it.  Please
correct me if I am wrong.
                
> Addendum for HBASE-5160 and HBASE-4397
> --------------------------------------
>
>                 Key: HBASE-5237
>                 URL: https://issues.apache.org/jira/browse/HBASE-5237
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.5
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.0, 0.90.6
>
>         Attachments: HBASE-5237_0.90.patch, HBASE-5237_trunk.patch
>
>
> As part of HBASE-4397 there is one more scenario where the patch has to be applied.
> {code}
> RegionPlan plan = getRegionPlan(state, forceNewPlan);
> if (plan == null) {
>   debugLog(state.getRegion(),
>       "Unable to determine a plan to assign " + state);
>   return; // Should get reassigned later when RIT times out.
> }
> {code}
> I think that in this scenario as well, the following should be done:
> {code}
> this.timeoutMonitor.setAllRegionServersOffline(true);
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
