hbase-dev mailing list archives

From "philo vivero (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-8105) RegionServer Doesn't Rejoin Cluster after Netsplit
Date Thu, 14 Mar 2013 15:02:13 GMT
philo vivero created HBASE-8105:

             Summary: RegionServer Doesn't Rejoin Cluster after Netsplit
                 Key: HBASE-8105
                 URL: https://issues.apache.org/jira/browse/HBASE-8105
             Project: HBase
          Issue Type: Bug
          Components: regionserver
    Affects Versions: 0.92.1
         Environment: Linux Ubuntu 10.04 LTS
            Reporter: philo vivero

We are running a 15-node HBase cluster and testing various failure scenarios. In this test we
segregate one RegionServer (RS) from the cluster by firewalling off every port except SSH
(because we need to be able to re-enable the node later).

After the RS is automatically removed from the cluster, we re-enable all ports, but the
RS never rejoins the cluster.

This may be the desired behaviour, but I haven't found proof so far. The code
has no comment indicating that this is the desired behaviour:


See the lines starting at 624, public void run(). Execution makes it through the first
try/catch block, but then loops inside the second try/catch block. Our hypothesis is that
it never gets out of that loop.
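
To make the hypothesis concrete, here is a minimal sketch of the control flow we think we are
seeing. It is NOT the HBase source: the class RunLoopSketch, the Master interface, and the
register()/report() methods are all hypothetical names invented for illustration, and the
maxIterations bound exists only so that the sketch terminates. It assumes the main loop
swallows the connection error without re-registering and without setting the stop flag.

```java
import java.io.IOException;

public class RunLoopSketch {
    /** Hypothetical stand-in for the master connection; not an HBase API. */
    public interface Master {
        void register() throws IOException;
        void report() throws IOException;
    }

    private volatile boolean stopped = false;
    private int failedReports = 0;

    public boolean isStopped() {
        return stopped;
    }

    /** Runs the sketched loop and returns how many reports failed. */
    public int run(Master master, int maxIterations) {
        // First try/catch block: startup and initial registration.
        try {
            master.register();
        } catch (IOException e) {
            return failedReports; // startup failed, so there is nothing to loop on
        }

        // Second try/catch block: the main loop that keeps reporting in.
        try {
            for (int i = 0; i < maxIterations && !stopped; i++) {
                try {
                    master.report();
                } catch (IOException e) {
                    // Assumed behaviour: the error is swallowed. The loop neither
                    // re-registers nor sets 'stopped', so it keeps spinning.
                    failedReports++;
                }
            }
        } catch (RuntimeException e) {
            stopped = true; // only an unexpected error would end the loop
        }
        return failedReports;
    }

    public static void main(String[] args) {
        RunLoopSketch sketch = new RunLoopSketch();
        // Simulate the netsplit: registration worked earlier, every report now fails.
        Master unreachable = new Master() {
            public void register() { }
            public void report() throws IOException {
                throw new IOException("master unreachable");
            }
        };
        int failed = sketch.run(unreachable, 5);
        System.out.println("failed reports: " + failed + ", stopped: " + sketch.isStopped());
    }
}
```

If the real loop behaves like this sketch, the process would stay alive but would never
re-run the registration step, which would match what we observe.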

If we bounce the RegionServer process, then it rejoins the cluster.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
