hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/MigrationToNewCluster" by NatHarward
Date Mon, 21 Jun 2010 02:24:43 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hbase/MigrationToNewCluster" page has been changed by NatHarward.
The comment on this change is: Adding the rest of the story..
http://wiki.apache.org/hadoop/Hbase/MigrationToNewCluster?action=diff&rev1=1&rev2=2

--------------------------------------------------

  = Migrating an Hbase instance to a new cluster =
  
- ''Note: these steps were performed on a trivially small Hbase instance to start with, results
on a larger migration forthcoming''
+ ''Note: these steps were performed on a trivially small Hbase instance to start with, results
on the larger "real" migration forthcoming''
  
- The steps below were taken to successfully migrate a complete and working Hbase instance
to a brand new, clean cluster in a different location (no shared hardware).  The initial setup
was as follows:
+ This is an Hbase user's account of migrating data from one data center to another.  The steps
below were taken to successfully (and non-destructively) migrate a complete and working Hbase
instance to a brand new, clean cluster in a different location (no shared hardware).  The
initial setup was as follows:
  
+  * Assumptions
+   * You have network connectivity between the two clusters
+   * The new environment is a working environment but pristine, at least for the HDFS and
Zookeeper roots you use for Hbase
+   * Familiarity with Hbase and HDFS administration
+   * At least one working Map/Reduce cluster available
   * Existing cluster (data center 1)
    * HDFS (v. 0.20.2)
    * Zookeeper (v. 3.2.2)
    * Hbase (v. 0.20.3)
-  * New cluster (data center 2)
+  * New cluster (same software versions, data center 2)
+   * HDFS
+   * '''Map/Reduce''' (this was not part of the original cluster, but at least one Map/Reduce
cluster must be available, because ''distcp'' uses Map/Reduce)
+   * Zookeeper
+   * Hbase
  
- (to be continued, small interruption...)
+ The steps:
  
+  # Get existing Hbase into a stable state
+   # Disable access to the existing Hbase, so that no data is changing (optionally disabling
your tables just to be sure)
+   # Major compact all tables
+   # Flush all tables
+   # Shut down Hbase
+  # Get existing HDFS into a stable state
+   # Disable any other HDFS access besides Hbase, which should now be shut down
+   # Enter HDFS safe-mode
+   # Run ''fsck'' and verify that everything is ok
+  # Push Hbase data from the old cluster to the new one
+   # Use the ''distcp'' command to copy your Hbase root directory to the new cluster (this
command uses Map/Reduce, so it should be available where you run this command)
+   # ''Note: when I tried copying the root of the entire HDFS tree I got a NPE, but pushing
a top-level directory worked fine''
+  # Verify data in the new HDFS cluster
+   # I did a spot-check of file sizes and directories to my satisfaction -- depending on
how important your data is you may want some kind of checksumming crawler for verification
+   # Run ''fsck'' to make sure HDFS is happy and does not think anything is wrong
+  # Fire up Hbase in the new cluster
+   # Make sure your configuration points to the ''new'' Hbase root directory in the new cluster
+  # Verify Hbase connectivity and, again to your satisfaction, verify the data accessible
through your new Hbase instance
+  # Finito!
+ 
+ I specifically did not attempt to migrate any Zookeeper data, and found that I did not need
to -- I did not encounter any problems in skipping this but your mileage may vary.
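The spot-check of file sizes mentioned in the verification step could be automated by comparing recursive listings from both clusters. This sketch assumes the listings were captured as text with `hadoop fs -lsr <root>` on each side, and that each line has the usual eight columns (permissions, replication, owner, group, size, date, time, path). It compares sizes only; it is not the checksumming crawler the steps suggest for important data. The function names are made up for this example.

```python
def parse_lsr(listing, root):
    """Map each file path (relative to root) to its size in bytes."""
    sizes = {}
    for line in listing.splitlines():
        fields = line.split(None, 7)
        # Skip headers such as "Found N items" and directory entries.
        if len(fields) != 8 or fields[0].startswith("d"):
            continue
        path = fields[7]
        if path.startswith(root):
            sizes[path[len(root):]] = int(fields[4])
    return sizes


def diff_listings(old, new):
    """Return a list of human-readable differences between two size maps."""
    problems = []
    for path in sorted(set(old) | set(new)):
        if path not in new:
            problems.append("missing on new: " + path)
        elif path not in old:
            problems.append("extra on new: " + path)
        elif old[path] != new[path]:
            problems.append("size differs: " + path)
    return problems
```

An empty list from `diff_listings` means every file under the Hbase root exists on both sides with the same size, which is a weaker guarantee than matching checksums.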
+ 
+ == If something goes wrong ==
+ 
+ Your old cluster should still be intact, though in read-only mode.  To bring it back online,
make sure HDFS is out of safe mode (if you put it there), then ensure Hbase is running and
your tables are enabled.
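The recovery described above could be laid out the same way. Again a sketch under assumptions: the table names are hypothetical, and it assumes `hadoop dfsadmin -safemode leave`, `start-hbase.sh` and the `hbase shell` command `enable` are available on the old cluster. Nothing is executed.

```python
def rollback_plan(tables, tables_were_disabled=True):
    """Return the commands that bring the old cluster back online, in order."""
    plan = [
        # Only needed if HDFS was put into safe mode during the migration.
        "hadoop dfsadmin -safemode leave",
        "start-hbase.sh",
    ]
    if tables_were_disabled:
        # Run these inside the hbase shell on the old cluster.
        for table in tables:
            plan.append("enable '{}'".format(table))
    return plan


if __name__ == "__main__":
    for command in rollback_plan(["mytable"]):  # hypothetical table name
        print(command)
```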
+ 
