hbase-issues mailing list archives

From "Jesse Yates (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5353) HA/Distributed HMaster via RegionServers
Date Wed, 08 Feb 2012 18:40:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203829#comment-13203829 ]

Jesse Yates commented on HBASE-5353:

I was thinking about this, and it seems like it wouldn't be that hard to have each of the regionservers
do leader election via ZK to select the one (or top 'n' rs) that would spin up master instances
on their local machines. Those new masters could then do their own leader election in ZK to determine
which is the current 'official' HMaster, and the others would act as hot failovers. If a master
dies, the next rs in the list would spin up a master instance, ensuring that we always have
a certain number of hot masters (cascading failure is clearly a problem here, but if that
happens, you have bigger problems). Running the master in the same JVM as the regionserver is probably
a bad idea, but you could potentially use the startup scripts to spin up a separate JVM
for the master.
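
To make the two-level election above concrete, here is a minimal in-memory sketch. It makes no ZooKeeper calls: the sorted map only stands in for an ordered list of election entries, and the class name, method names, and the assumption that masters keep the same relative order as their host regionservers are all hypothetical, not anything taken from HBase.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/**
 * In-memory simulation of the proposed election (hypothetical names throughout).
 * The TreeMap stands in for an ordered election list; a real version would
 * keep that list in ZK rather than in local memory.
 */
class MasterElectionSketch {
    private final int hotMasters;                               // 'n': masters to keep running
    private final TreeMap<Long, String> live = new TreeMap<>(); // join sequence -> rs name
    private long nextSeq = 0;

    MasterElectionSketch(int hotMasters) {
        if (hotMasters < 1) {
            throw new IllegalArgumentException("need at least one master");
        }
        this.hotMasters = hotMasters;
    }

    /** A regionserver joins the election and takes the next sequence number. */
    void join(String regionServer) {
        live.put(nextSeq++, regionServer);
    }

    /** A regionserver dies, taking any master it hosts with it. */
    void fail(String regionServer) {
        live.values().remove(regionServer);
    }

    /** The first 'n' live regionservers, in join order, each host a master. */
    List<String> masterHosts() {
        List<String> hosts = new ArrayList<>();
        for (String rs : live.values()) {
            if (hosts.size() == hotMasters) {
                break;
            }
            hosts.add(rs);
        }
        return hosts;
    }

    /** The first master host is the 'official' HMaster; null if nothing is alive. */
    String activeMaster() {
        return live.isEmpty() ? null : live.firstEntry().getValue();
    }
}
```

With n = 2 and regionservers rs1, rs2, rs3 joining in that order, rs1 and rs2 host masters and rs1 is active. If rs1 fails, rs2 (already a hot failover) becomes active and rs3 becomes a master host, so the number of hot masters stays at two.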

This also means some modification to the client, so it can keep track of the current master, but
that should be fairly trivial, as it already has the ZK connection (or it can fail and then look the master up again).
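
The "fail and lookup" option could look roughly like the sketch below. This is an illustration under assumptions, not the HBase client: the class and method names are hypothetical, the `Supplier` stands in for however the client would read the current master's address (for example from ZK), and retrying exactly once is an arbitrary choice.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical client wrapper that caches the master address and refreshes it on failure. */
class MasterTrackingClient {
    private final Supplier<String> lookup; // stand-in for reading the master address, e.g. from ZK
    private String cachedMaster;

    MasterTrackingClient(Supplier<String> lookup) {
        this.lookup = lookup;
        this.cachedMaster = lookup.get();
    }

    /** Runs the call against the cached master; on failure, looks the master up again and retries once. */
    <T> T call(Function<String, T> rpc) {
        try {
            return rpc.apply(cachedMaster);
        } catch (RuntimeException firstFailure) {
            cachedMaster = lookup.get();    // the old master may have died; refresh the address
            return rpc.apply(cachedMaster); // a second failure propagates to the caller
        }
    }

    String currentMaster() {
        return cachedMaster;
    }
}
```

The lookup only happens when a call fails, so the common case costs nothing extra; the alternative mentioned above, watching the master's entry over the existing ZK connection, would instead update the cached address before any call fails.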

> HA/Distributed HMaster via RegionServers
> ----------------------------------------
>                 Key: HBASE-5353
>                 URL: https://issues.apache.org/jira/browse/HBASE-5353
>             Project: HBase
>          Issue Type: Improvement
>          Components: master, regionserver
>    Affects Versions: 0.94.0
>            Reporter: Jesse Yates
>            Priority: Minor
> Currently, the HMaster node must be considered a 'special' node (single point of failure),
> meaning that the node must be protected more than the other commodity machines. It should
> be possible to instead have the HMaster be much more available, either in a distributed sense
> (meaning a big rewrite) or with multiple instances and automatic failover.
