hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/MultipleMasters" by Misty
Date Fri, 16 Oct 2015 03:18:28 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hbase/MultipleMasters" page has been changed by Misty:

- '''This document is still a draft'''
+ The HBase Wiki is in the process of being decommissioned. The info that used to be on this
page has moved to http://hbase.apache.org/book.html#quickstart_fully_distributed. Please update
your bookmarks.
- Since version 0.20.0 HBase supports multiple Masters to provide higher availability. It
works in the same way that Bigtable does as explained in the 2006 paper. This page contains
the information you need to set it up, maintain it, and to understand how it works under the
- == Single Master Setup ==
- The [[http://hadoop.apache.org/hbase/docs/current/api/overview-summary.html#overview_description|Getting
Started]] documentation describes set up with a single Master. A cluster can run without a
Master for a few minutes but your regions will be unable to split. If it does happen, you
can go to any other machine with the correct installation/configuration and do {{{$ ${HBASE_HOME}/bin/hbase-daemon.sh
start master}}}. This newly started Master will take over Master functions.
- Currently the Hadoop Distributed Filesystem is '''not''' highly available so if the Namenode
resides on the same machine as your Master, the cluster is still wedged and you will have
to shut down HBase with a high probability of losing data.
- == Multiple Masters Setup ==
- Before setting up multiple Masters, you should already have built an HBase cluster with
a single Master. If not, please refer to the [[http://hadoop.apache.org/hbase/docs/current/api/overview-summary.html#overview_description|Getting
Started]] documentation.
- === Basic knowledge ===
- The multi-master feature introduced in 0.20.0 does not add cooperating Masters; there is
still just one working Master while the other ''backups'' wait. For example, if you start
200 Masters only 1 will be active while the others wait for it to die. The switch usually
takes `zookeeper.session.timeout` plus a couple of seconds to occur. See "How it works inside"
below for more information.
- === Designing your highly available setup ===
- The rule of thumb here is to not put all your eggs in the same basket. You don't want a
Namenode and a Master on the same machine because currently you can recover automatically
from a Master failure but not from a Namenode failure. Be sure that the Namenode has its own
very reliable machine until Hadoop 0.21 comes in with ''Backup Namenodes''. Also you don't
want to have a Region Server and a Master on the same node, as that machine failure will imply
first a Master failover and then the new Master will have to split the logs of the failed
- Your ideal highly available cluster would have 5 or more dedicated Zookeeper servers, 2-3
dedicated Master servers (one per rack for example), 1 very reliable Namenode/Job Tracker
server with redundant hardware and the rest is the usual Datanode/Task Tracker/Region Server
stack. If you don't even have twice that amount of machines, you will have to evaluate some
trade-offs. For example, you could try to keep a dedicated Master server and put the others
alongside the Region Servers, as the failure of a backup Master doesn't have any impact, and you
could do the same for the ZK servers.
- === Managing the Masters ===
- Currently handling the other Masters isn't really user friendly but it's getting worked
on. When you start HBase, your first main Master will also be started. To start other Masters
do {{{$ ${HBASE_HOME}/bin/hbase-daemon.sh start master}}} on all the nodes you want to, as
long as they have the correct installation/configuration. You could also do {{{$ ${HBASE_HOME}/bin/hbase-daemons.sh
start master}}} and that would start a Master on every machine listed in {{{conf/regionservers}}}.
- To stop any Master '''without shutting down HBase''', you currently have to {{{kill -9}}}
it (This is OK.  All state is maintained elsewhere off in ZooKeeper and out on RegionServers).
If you kill the active Master, first make sure it's not splitting logs as you could lose data.
To check that, tail the Master's log and watch for anything that says "Splitting logs # of
- == How it works inside ==
