hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/MigrationToNewCluster" by Misty
Date Fri, 16 Oct 2015 03:35:56 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hbase/MigrationToNewCluster" page has been changed by Misty:
https://wiki.apache.org/hadoop/Hbase/MigrationToNewCluster?action=diff&rev1=5&rev2=6

- = Migrating an HBase instance to a new cluster =
+ The HBase Wiki is in the process of being decommissioned. The information that used to be on this page has moved to http://hbase.apache.org/book.html#_import. Please update your bookmarks.
  
- ''Note: these steps were first performed on a trivially small HBase instance; results from the larger "real" migration are forthcoming.''
- 
- '''''Update''': the larger HBase instance, roughly 350 GB on HDFS, migrated just as well. That is not large by Hadoop standards, but it is non-trivial.''
- 
- This is an HBase user's account of migrating data from one data center to another.  The steps below were taken to successfully (and non-destructively) migrate a complete, working HBase instance to a brand-new, clean cluster in a different location (no shared hardware).  The initial setup was as follows:
- 
-  * Assumptions
-   * You have network connectivity between the two clusters
-   * The new environment is working but pristine, at least for the HDFS and ZooKeeper roots you use for HBase
-   * You are familiar with HBase and HDFS administration
-   * At least one working Map/Reduce cluster is available
-  * Existing cluster (data center 1)
-   * HDFS (v. 0.20.2)
-   * ZooKeeper (v. 3.2.2)
-   * HBase (v. 0.20.3)
-  * New cluster (same software versions, data center 2)
-   * HDFS
-   * '''Map/Reduce''' (not part of the original cluster, but at least one Map/Reduce cluster must be available)
-   * ZooKeeper
-   * HBase
- 
- The steps:
- 
-  I. Get the existing HBase into a stable state
-   i. Disable access to the existing HBase so that no data is changing (optionally, disable your tables just to be sure)
-   i. Major compact all tables
-   i. Flush all tables
-   i. Shut down HBase
-  I. Get the existing HDFS into a stable state
-   i. Disable any HDFS access other than HBase, which should now be shut down
-   i. Enter HDFS safe mode
-   i. Run ''fsck'' and verify that everything is OK
-  I. Push HBase data from the old cluster to the new one
-   i. Use the ''distcp'' command to copy your HBase root directory to the new cluster (this command uses Map/Reduce, so Map/Reduce must be available where you run it)
-   i. ''Note: when I tried copying the root of the entire HDFS tree I got a NullPointerException (NPE), but pushing a top-level directory worked fine''
-  I. Verify data in the new HDFS cluster
-   i. I did a spot check of file sizes and directories to my satisfaction; depending on how important your data is, you may want some kind of checksumming crawler for verification
-   i. Run ''fsck'' to make sure HDFS does not report any problems
-  I. Start HBase in the new cluster
-   i. Make sure your configuration points to the ''new'' HBase root directory in the new cluster and to your ''new'' ZooKeeper instance
-   i. Enable all your tables if you disabled them originally
-  I. Verify HBase connectivity and, again to your satisfaction, verify the data accessible through your new HBase instance
-  I. Finito!
- 
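The steps above could be sketched as an ordered command plan. This is only an illustration, not part of the original page: the host names, the port, the `/hbase` root directory and the table names are all hypothetical, and the script only prints the commands instead of running them. The commands themselves (`major_compact`, `flush`, `hadoop dfsadmin -safemode`, `hadoop fsck`, `hadoop distcp`, `bin/stop-hbase.sh`, `bin/start-hbase.sh`) are assumed to be the 0.20-era spellings; check them against your own installation.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (where to run it, command or HBase shell statement)


def migration_plan(old_nn: str, new_nn: str, root: str,
                   tables: List[str]) -> List[Step]:
    """Return the migration steps in the order the page lists them.

    old_nn / new_nn are hypothetical "host:port" NameNode addresses and
    root is the HBase root directory in HDFS (assumed identical on both
    clusters). Nothing is executed here.
    """
    plan: List[Step] = []
    # 1. Stabilise HBase on the old cluster.
    for table in tables:
        plan.append(("old hbase shell", f"major_compact '{table}'"))
    for table in tables:
        plan.append(("old hbase shell", f"flush '{table}'"))
    plan.append(("old cluster", "bin/stop-hbase.sh"))
    # 2. Stabilise HDFS on the old cluster.
    plan.append(("old cluster", "hadoop dfsadmin -safemode enter"))
    plan.append(("old cluster", f"hadoop fsck {root}"))
    # 3. Copy the HBase root directory; distcp needs Map/Reduce.
    plan.append(("cluster with Map/Reduce",
                 f"hadoop distcp hdfs://{old_nn}{root} hdfs://{new_nn}{root}"))
    # 4. Verify HDFS on the new cluster, then start HBase there.
    plan.append(("new cluster", f"hadoop fsck {root}"))
    plan.append(("new cluster", "bin/start-hbase.sh"))
    return plan


if __name__ == "__main__":
    # Hypothetical hosts and table names, for illustration only.
    for where, command in migration_plan("old-nn.example.com:8020",
                                         "new-nn.example.com:8020",
                                         "/hbase", ["users", "events"]):
        print(f"[{where}] {command}")
```

The optional disabling and re-enabling of tables is left out of the sketch, because the page does not say at which point in the sequence the author did it.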
- I deliberately did not attempt to migrate any ZooKeeper data and found that I did not need to: skipping it caused no problems, but your mileage may vary.
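For the spot check of file sizes mentioned in the verification step, one possible approach is to save a recursive listing from each cluster and compare them offline. The sketch below is an assumption-laden illustration: it assumes the usual eight-column listing layout (permissions, replication, owner, group, size, date, time, path), that both clusters use the same root path, and the sample lines are invented. It compares sizes only, so it would catch missing or truncated files but not corrupted contents; a real checksumming crawler would be stronger.

```python
from typing import Dict, Iterable, List, Tuple


def parse_listing(lines: Iterable[str]) -> Dict[str, int]:
    """Map file path -> size in bytes from recursive listing output.

    Assumes eight whitespace-separated columns with the size in the
    fifth and the path in the eighth. Directories and header lines
    such as "Found 3 items" are skipped.
    """
    sizes: Dict[str, int] = {}
    for line in lines:
        fields = line.split(None, 7)
        if len(fields) < 8 or fields[0].startswith("d"):
            continue
        if not fields[4].isdigit():
            continue
        sizes[fields[7].rstrip("\n")] = int(fields[4])
    return sizes


def compare_listings(old: Dict[str, int], new: Dict[str, int]
                     ) -> Tuple[List[str], List[str], List[str]]:
    """Return (missing on new, extra on new, size mismatches)."""
    missing = sorted(p for p in old if p not in new)
    extra = sorted(p for p in new if p not in old)
    mismatched = sorted(p for p in old if p in new and old[p] != new[p])
    return missing, extra, mismatched


if __name__ == "__main__":
    # Invented sample listings, for illustration only.
    old_lines = [
        "Found 2 items",
        "drwxr-xr-x   - hbase supergroup          0 2010-05-01 12:00 /hbase/t1",
        "-rw-r--r--   3 hbase supergroup       1234 2010-05-01 12:00 /hbase/t1/a",
        "-rw-r--r--   3 hbase supergroup        500 2010-05-01 12:00 /hbase/t1/b",
    ]
    new_lines = [
        "-rw-r--r--   3 hbase supergroup       1234 2010-05-02 09:00 /hbase/t1/a",
    ]
    print(compare_listings(parse_listing(old_lines), parse_listing(new_lines)))
```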
- 
- == If something goes wrong ==
- 
- Your old cluster should still be intact, though in read-only mode.  To bring it back online, make sure HDFS is out of safe mode (if you put it there), then ensure HBase is running and your tables are enabled.
- 
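The recovery described above might look like the following sketch. As before, it only assembles the commands rather than running them; the table names are hypothetical, and the command spellings (`hadoop dfsadmin -safemode leave`, `bin/start-hbase.sh`, the HBase shell `enable` statement) are assumed from the 0.20-era tools.

```python
from typing import List, Tuple


def rollback_plan(tables: List[str],
                  hdfs_in_safe_mode: bool = True,
                  tables_were_disabled: bool = False
                  ) -> List[Tuple[str, str]]:
    """Return (where, command) pairs that bring the old cluster back.

    The two flags record what was actually done before the copy, so
    that only the matching undo steps are emitted.
    """
    steps: List[Tuple[str, str]] = []
    if hdfs_in_safe_mode:
        steps.append(("old cluster", "hadoop dfsadmin -safemode leave"))
    steps.append(("old cluster", "bin/start-hbase.sh"))
    if tables_were_disabled:
        for table in tables:
            steps.append(("old hbase shell", f"enable '{table}'"))
    return steps


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    for where, command in rollback_plan(["users", "events"],
                                        tables_were_disabled=True):
        print(f"[{where}] {command}")
```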
