hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/MigrationToNewCluster" by Misty
Date Fri, 16 Oct 2015 03:35:56 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hbase/MigrationToNewCluster" page has been changed by Misty:

- = Migrating an HBase instance to a new cluster =
+ The HBase Wiki is in the process of being decommissioned. The info that used to be on this page has moved to http://hbase.apache.org/book.html#_import. Please update your bookmarks.
- ''Note: these steps were performed on a trivially small HBase instance to start with; results on the larger "real" migration are forthcoming''
- '''''Update''': the large(r) HBase instance worked just as well, roughly 350GB on HDFS. Not large by Hadoop standards, but non-trivial nonetheless.''
- This is an HBase user's account of migrating data from one data center to another.  The steps below were taken to successfully (and non-destructively) migrate a complete and working HBase instance to a brand-new, clean cluster in a different location (no shared hardware).  The initial setup was as follows:
-  * Assumptions
-   * You have network connectivity between the two clusters
-   * The new environment is a working environment but pristine, at least for the HDFS and ZooKeeper roots you use for HBase
-   * Familiarity with HBase and HDFS administration
-   * At least one working Map/Reduce cluster available
-  * Existing cluster (data center 1)
-   * HDFS (v. 0.20.2)
-   * ZooKeeper (v. 3.2.2)
-   * HBase (v. 0.20.3)
-  * New cluster (same software versions, data center 2)
-   * HDFS
-   * '''Map/Reduce''' (this was not part of the original cluster, but is necessary to have at least one Map/Reduce cluster available)
-   * ZooKeeper
-   * HBase
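Since the new cluster's HBase must point at its own HDFS root and its own ZooKeeper ensemble (as the configuration step in the walkthrough below notes), a fragment along these lines would live in ''hbase-site.xml'' on the new cluster. The host names here are placeholders for illustration, not values from the original migration:

```xml
<!-- Hypothetical hbase-site.xml fragment for the new (data center 2) cluster.
     dc2-namenode and dc2-zk1..3 are placeholder host names. -->
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://dc2-namenode:8020/hbase</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>dc2-zk1,dc2-zk2,dc2-zk3</value>
</property>
```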
- The steps:
-  I. Get existing HBase into a stable state
-   i. Disable access to the existing HBase, so that no data is changing (optionally disabling your tables just to be sure)
-   i. Major compact all tables
-   i. Flush all tables
-   i. Shut down HBase
-  I. Get existing HDFS into a stable state
-   i. Disable any other HDFS access besides HBase, which should now be shut down
-   i. Enter HDFS safe-mode
-   i. Run ''fsck'' and verify that everything is ok
-  I. Push HBase data from the old cluster to the new one
-   i. Use the ''distcp'' command to copy your HBase root directory to the new cluster (this command uses Map/Reduce, so it should be available where you run this command)
-   i. ''Note: when I tried copying the root of the entire HDFS tree I got an NPE, but pushing a top-level directory worked fine''
-  I. Verify data in the new HDFS cluster
-   i. I did a spot-check of file sizes and directories to my satisfaction -- depending on how important your data is you may want some kind of checksumming crawler for verification
-   i. Run ''fsck'' to make sure HDFS is happy and does not think anything is wrong
-  I. Fire up HBase in the new cluster
-   i. Make sure your configuration points to the ''new'' HBase root directory in the new cluster and your ''new'' ZooKeeper instance
-   i. Enable all your tables if you disabled them originally
-  I. Verify HBase connectivity, and again to your satisfaction verify the data accessible through your new HBase instance
-  I. Finito!
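The steps above can be sketched as a runbook. Host and table names are placeholders (assumptions), and each command is echoed rather than executed so the sketch can be reviewed before anything is run against a real cluster:

```shell
#!/bin/sh
# Sketch of the migration steps. SRC/DST host names and the table name
# 'mytable' are hypothetical placeholders -- substitute your own.
SRC="hdfs://dc1-namenode:8020/hbase"   # old cluster HBase root (assumed path)
DST="hdfs://dc2-namenode:8020/hbase"   # new cluster HBase root (assumed path)

# 1. Quiesce HBase. The first three commands are run inside 'hbase shell',
#    once per table; then stop HBase from the master host.
echo "major_compact 'mytable'"
echo "flush 'mytable'"
echo "disable 'mytable'"               # optional, just to be sure
echo "stop-hbase.sh"

# 2. Freeze HDFS and check it.
echo "hadoop dfsadmin -safemode enter"
echo "hadoop fsck /hbase"

# 3. Copy the HBase root to the new cluster. distcp runs a Map/Reduce job,
#    so issue it from a node with a working Map/Reduce configuration.
echo "hadoop distcp $SRC $DST"

# 4. Spot-check the copy on the new cluster.
echo "hadoop fs -dus /hbase"
echo "hadoop fsck /hbase"

# 5. Start HBase on the new cluster and re-enable tables if you disabled them.
echo "start-hbase.sh"
echo "enable 'mytable'"
```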
- I specifically did not attempt to migrate any ZooKeeper data, and found that I did not need to -- I did not encounter any problems in skipping this but your mileage may vary.
- == If something goes wrong ==
- Your old cluster should still be intact, though in read-only mode.  To bring it back online, make sure HDFS is out of safe mode (if you put it there), then ensure HBase is running and your tables are enabled.
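A minimal sketch of that fallback, with the same caveats as before (placeholder table name, commands echoed rather than executed):

```shell
#!/bin/sh
# Sketch of bringing the old cluster back online. TABLE is a hypothetical
# placeholder; the enable command runs inside 'hbase shell' per table.
TABLE="mytable"
echo "hadoop dfsadmin -safemode leave"   # only if you entered safe mode
echo "start-hbase.sh"
echo "enable '$TABLE'"
```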
