hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: HBase export/import
Date Wed, 10 Feb 2010 17:27:23 GMT
This is also how you can run multiple HBase instances on top of a single
HDFS shared by all of them:

   One HBase rootdir at hdfs://host:port/hbase-foo
   Another HBase rootdir at hdfs://host:port/hbase-bar

etc. 

And you can run each HBase instance under a different user account and
use HDFS permissions as a very coarse way to keep the data of one HBase
instance private from the others. This is a (lame) solution for multitenancy
until HBASE-1697.
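
A rough sketch of what that could look like on the shared HDFS (the unix
accounts "foo" and "bar" are made-up names; each instance's
conf/hbase-site.xml would point hbase.rootdir at its own directory):

   # one rootdir per HBase instance on the shared HDFS
   hadoop/bin/hadoop fs -mkdir /hbase-foo /hbase-bar

   # each instance runs as its own account and owns only its rootdir
   hadoop/bin/hadoop fs -chown foo:foo /hbase-foo
   hadoop/bin/hadoop fs -chown bar:bar /hbase-bar

   # lock each rootdir down so the other account cannot read it
   hadoop/bin/hadoop fs -chmod 700 /hbase-foo /hbase-bar

   # in each instance's conf/hbase-site.xml:
   #   hbase.rootdir = hdfs://host:port/hbase-foo   (or /hbase-bar)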

Just a random thought, 

   - Andy


----- Original Message ----
> From: Dan Washusen <dan@reactive.org>
> To: hbase-user@hadoop.apache.org
> Sent: Tue, February 9, 2010 7:02:22 PM
> Subject: Re: HBase export/import
> 
> +1
> 
> I use this method when performance testing on different data sets.  I have
> several data sets to test with (varying sizes, etc.).  When I want to switch
> data sets I just shut down HBase and rename the /hbase directory...
> 
> e.g. (assuming hbase is not running)
> hadoop/bin/hadoop fs -mv /hbase /hbase.small
> hadoop/bin/hadoop fs -mv /hbase.large /hbase
> 
> When I want to move my data between clusters I use:
> hadoop/bin/hadoop fs -copyToLocal /hbase.large /tmp/hbase.large
> scp -r /tmp/hbase.large user@host:/tmp
> ssh user@host
> hadoop/bin/hadoop fs -put /tmp/hbase.large /hbase
> 
> 
> Very handy :)
> 
> 
> On 10 February 2010 13:50, Ryan Rawson wrote:
> 
> > If you stop the source cluster then you can distcp the /hbase to the
> > other cluster. Done. A perfect copy.
> >
> > That is probably the most efficient/highest performing way.
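> >
> > Something like the following, assuming HBase on the source cluster is
> > fully shut down first (hostnames and ports are placeholders):
> >
> >   hadoop/bin/hadoop distcp hdfs://src-host:port/hbase hdfs://dest-host:port/hbase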
> >
> > On Tue, Feb 9, 2010 at 6:47 PM, James Baldassari wrote:
> > > Hi,
> > >
> > > I'm wondering if it's possible to export all data from one HBase cluster
> > > and import it into another.  We have a lot of data that we've imported
> > > into our staging HBase environment, and rather than repeating the
> > > lengthy import process in our production environment we would prefer to
> > > just copy all the data directly from HBase/HDFS in staging into
> > > production.  Is there an easy way to do this?  I know Hadoop has some
> > > distributed copy functionality, but I don't know if this will work with
> > > HBase.  The number of region servers and the replication factor will be
> > > the same in the source and destination environments, but the
> > > hostnames/IPs will be different.  The production environment is
> > > completely empty right now, so we don't need to worry about overwriting
> > > data.
> > >
> > > I came across these links while searching for information on HBase
> > > export/import:
> > >
> > > http://issues.apache.org/jira/browse/HBASE-897
> > > http://issues.apache.org/jira/browse/HBASE-1684
> > >
> > > http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/mapreduce/Export.html
> > >
> > > Has anyone used these tools?  Is there a better way?
> > >
> > > Thanks,
> > > James
> > >
> > >
> > >
> >
