hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Trivial Update of "Hbase/Migration" by stack
Date Tue, 22 Jan 2008 07:24:04 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by stack:
http://wiki.apache.org/hadoop/Hbase/Migration

------------------------------------------------------------------------------
  
   * All hbase data and state is out on the FileSystem: Moving from one version to another
should just be a matter of moving or rewriting files on the FileSystem.
   * Hbase cannot be running when a migration is run.
+   * This can be tricky to assert when the hbase versions differ to such an extent that they
are unable to talk to each other (the caller just hangs and eventually times out).
   * Sometimes, the amount of on-filesystem data that needs to be changed will be large so
migration will need to run a MR job.
-  * hbase FS image needs versioning.  On startup, hbase will check the FS version.  If awry,
hbase will shut itself down emitting a migration needed message.  Versions are finer-grained
than release number (svn revision?).
+  * hbase FS image needs versioning.  On startup, hbase will check the FS version.  If awry,
hbase will shut itself down emitting a migration needed message.  Versions are finer-grained
than release number.
   * The commit of every incompatible change would be accompanied by a script that can move
hbase across the incompatibility.
   * A migration runs migration scripts in order, from oldest through to latest (Migration
scripts are named in a manner that dictates an order -- or a catalog file lists the order
in which scripts are run).
   * Downtime must be minimal.
   * Migration script will do no damage if run when there is nothing to migrate
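
The ordering and no-damage rules in the list above can be sketched as follows. This is a minimal illustration in Python, not the hbase migration code: `MigrationStep`, `run_migrations`, and the use of consecutive integer versions are all hypothetical names and assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class MigrationStep:
    """Hypothetical script that moves data from `from_version` to `from_version + 1`."""
    from_version: int
    apply: Callable[[Dict[str, str]], None]


def run_migrations(state: Dict[str, str], current: int, steps: List[MigrationStep]) -> int:
    """Apply pending steps oldest first and return the resulting version.

    Steps older than `current` are skipped, so running this when there is
    nothing to migrate leaves `state` untouched.
    """
    for step in sorted(steps, key=lambda s: s.from_version):
        if step.from_version < current:
            continue  # this incompatibility was already crossed
        if step.from_version != current:
            # A gap in the chain: refuse to guess rather than risk damage.
            raise ValueError(f"no migration step from version {current}")
        step.apply(state)
        current += 1
    return current
```

Sorting by `from_version` plays the role of the naming scheme or catalog file that dictates script order; a real tool would operate on files under the hbase root directory rather than on an in-memory dictionary.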
  
  == Prerequisites/Dependencies ==
-  * Hbase fast backup to be run before migration to protect against data loss
+  * Hbase fast backup to be run before migration to protect against data loss: See '''./bin/hadoop
distcp'''
  
  == Issues ==
   * Should hbase classes be versioned and know how to migrate themselves?  Seems like excessive
overhead, especially for smaller classes such as H!StoreKey and its like.  If not, how do we go
between versions (how to float two versions of the same class in the same job)?
-   * Maybe overhead wouldn't be that bad.  See [http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/javadoc/org/apache/hadoop/io/VersionedWritable.html
VersionedWritable].  It uses single byte versioning.
+   * Maybe overhead wouldn't be that bad.  See [http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/javadoc/org/apache/hadoop/io/VersionedWritable.html
VersionedWritable].  It uses single byte versioning.  H!ColumnDescriptor is already versioned.
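
To make the overhead question concrete, here is a rough sketch of single-byte versioning in the spirit of VersionedWritable: the serialized form starts with one version byte, and the reader rejects a version it does not understand. It is written in Python for brevity and is not the Hadoop class; `KEY_VERSION`, `encode_key`, `decode_key`, and the row/timestamp layout are assumptions made for illustration and do not reflect the real H!StoreKey format.

```python
import struct

KEY_VERSION = 1  # hypothetical version of this record layout


def encode_key(row: bytes, timestamp: int) -> bytes:
    """Serialize as: 1 version byte, 4-byte row length, row bytes, 8-byte timestamp."""
    header = struct.pack(">BI", KEY_VERSION, len(row))
    return header + row + struct.pack(">q", timestamp)


def decode_key(data: bytes):
    """Check the leading version byte, then read the remaining fields."""
    version, row_len = struct.unpack_from(">BI", data, 0)
    if version != KEY_VERSION:
        raise ValueError(f"expected version {KEY_VERSION}, found {version}")
    row = data[5:5 + row_len]
    (timestamp,) = struct.unpack_from(">q", data, 5 + row_len)
    return row, timestamp
```

In this layout the versioning cost is exactly one byte per serialized record, whatever the size of the row.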
  
+ == Implementation/Decisions ==
+ 
+  * The migration script is named o.a.h.hbase.util.Migrate.  Run it by invoking '''${HBASE_HOME}/bin/hbase
migrate'''.
+  * Versions are explicit integers.  The first version is '''1'''.
+  * The version of a particular '''hbase.rootdir''' install is recorded into a file at the
top-level named '''hbase.version'''.
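
The startup check against the version file might look roughly like the sketch below. It uses a local directory as a stand-in for '''hbase.rootdir''' and assumes the file holds the version as plain decimal text; the file format, the treatment of a missing file, and the names `read_version`, `write_version`, `check_version`, and `MigrationNeeded` are all assumptions for illustration, not the behaviour of o.a.h.hbase.util.Migrate.

```python
from pathlib import Path
from typing import Optional

VERSION_FILE_NAME = "hbase.version"  # file name taken from the notes above
CURRENT_VERSION = 1  # hypothetical: the version this build expects


class MigrationNeeded(Exception):
    """Raised when the on-disk version does not match the expected one."""


def read_version(rootdir: Path) -> Optional[int]:
    """Return the recorded version, or None if no version file exists."""
    path = rootdir / VERSION_FILE_NAME
    if not path.exists():
        return None
    return int(path.read_text().strip())


def write_version(rootdir: Path, version: int) -> None:
    """Record the version as decimal text (assumed format)."""
    (rootdir / VERSION_FILE_NAME).write_text(str(version))


def check_version(rootdir: Path, expected: int = CURRENT_VERSION) -> None:
    """Refuse to start when the recorded version differs from `expected`."""
    found = read_version(rootdir)
    if found != expected:
        raise MigrationNeeded(
            f"file system version is {found}, expected {expected}; run the migration"
        )
```

Treating a missing file as a mismatch is one possible policy; an install that predates versioning could equally be handled as a special case.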
+ 
