hbase-issues mailing list archives

From "gaojinchao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
Date Wed, 24 Aug 2011 00:19:29 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089890#comment-13089890 ]

gaojinchao commented on HBASE-4124:
-----------------------------------

The RS isn't dead. I can reproduce and verify this.

The ZK status has already changed by the time the region is added to the RIT set. See the function processDeadServers.
That is why a region is assigned twice.

        // If region was in transition (was in zk) force it offline for reassign
        try {
          //Process with existing RS shutdown code  
          boolean assign =
            ServerShutdownHandler.processDeadRegion(regionInfo, result, this,
              this.catalogTracker);
          if (assign) {
            ZKAssign.createOrForceNodeOffline(watcher, regionInfo,
              master.getServerName()); 
          }
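
To make the timing problem in the excerpt above easier to follow, here is a minimal toy model of it. It is only a sketch under stated assumptions: `DoubleAssignSketch`, `ToyRegionServer` and `assignIfNotOnline` are hypothetical names invented for illustration, none of them are HBase or ZooKeeper APIs, and the guard shown is not necessarily what the attached patches do.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: these classes are hypothetical and are not HBase APIs.
public class DoubleAssignSketch {

    // Hypothetical stand-in for a region server and the regions it serves.
    public static final class ToyRegionServer {
        private final Set<String> online = new HashSet<>();

        // Returns false if the region is already served here, which models
        // the 'already online on this server' warning.
        public boolean open(String region) {
            return online.add(region);
        }

        public boolean isOnline(String region) {
            return online.contains(region);
        }
    }

    // Hypothetical guard: re-check the current state right before assigning,
    // instead of acting on a state that was read earlier.
    public static boolean assignIfNotOnline(ToyRegionServer rs, String region) {
        if (rs.isOnline(region)) {
            return false; // the earlier assignment already finished
        }
        return rs.open(region);
    }

    public static void main(String[] args) {
        ToyRegionServer rs = new ToyRegionServer();

        // 1. The new master reads the region's state while it is still opening.
        boolean staleViewSaysInTransition = true;

        // 2. The first assignment finishes before the master acts on that view.
        boolean firstOpen = rs.open("region-1");

        // 3. The master acts on the stale view and assigns the region again.
        boolean secondOpen = staleViewSaysInTransition && rs.open("region-1");

        System.out.println("first open succeeded: " + firstOpen);
        System.out.println("second open succeeded: " + secondOpen);
        System.out.println("guarded assign opened the region: "
            + assignIfNotOnline(rs, "region-1"));
    }
}
```

In this model the second open is rejected because the region is already online, which corresponds to the warning in the issue title. The guard avoids the duplicate request by checking the current state at the moment of assignment.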



> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned. The HM could not receive the message and reported 'Regions in transition timed out'.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
