hbase-user mailing list archives

From James Baldassari <ja...@dataxu.com>
Subject Re: HBase export/import
Date Wed, 10 Feb 2010 03:48:35 GMT
Wow, I guess it's a lot simpler than I thought it would be.  I'll give
it a try.  Thank you both for your helpful and quick responses!

-James


On Tue, 2010-02-09 at 21:07 -0600, Ryan Rawson wrote:
> You can also use distcp instead of the copyToLocal && scp &&
> copyFromLocal chain you have there.
> 
> On Tue, Feb 9, 2010 at 7:02 PM, Dan Washusen <dan@reactive.org> wrote:
> > +1
> >
> > I use this method when performance testing on different data sets.  I have
> > several datasets to test on (varying sizes, etc).   When I want to switch
> > datasets I just shut down hbase and rename the /hbase directory...
> >
> > e.g. (assuming hbase is not running)
> > hadoop/bin/hadoop fs -mv /hbase /hbase.small
> > hadoop/bin/hadoop fs -mv /hbase.large /hbase
> >
> > When I want to move my data between clusters I use:
> > hadoop/bin/hadoop fs -copyToLocal /hbase.large /tmp/hbase.large
> > scp -r /tmp/hbase.large user@host:/tmp
> > ssh user@host
> > hadoop/bin/hadoop fs -put /tmp/hbase.large /hbase
> >
> >
> > Very handy :)
> >
> >
> > On 10 February 2010 13:50, Ryan Rawson <ryanobjc@gmail.com> wrote:
> >
> >> If you stop the source cluster then you can distcp the /hbase to the
> >> other cluster. Done. A perfect copy.
> >>
> >> That is probably the most efficient/highest performing way.
> >>
> >> On Tue, Feb 9, 2010 at 6:47 PM, James Baldassari <james@dataxu.com> wrote:
> >> > Hi,
> >> >
> >> > I'm wondering if it's possible to export all data from one HBase cluster
> >> > and import it into another.  We have a lot of data that we've imported
> >> > into our staging HBase environment, and rather than repeating the
> >> > lengthy import process in our production environment we would prefer to
> >> > just copy all the data directly from HBase/HDFS in staging into
> >> > production.  Is there an easy way to do this?  I know Hadoop has some
> >> > distributed copy functionality, but I don't know if this will work with
> >> > HBase.  The number of region servers and the replication factor will be
> >> > the same in the source and destination environments, but the
> >> > hostnames/IPs will be different.  The production environment is
> >> > completely empty right now, so we don't need to worry about overwriting
> >> > data.
> >> >
> >> > I came across these links while searching for information on HBase
> >> > export/import:
> >> >
> >> > http://issues.apache.org/jira/browse/HBASE-897
> >> > http://issues.apache.org/jira/browse/HBASE-1684
> >> >
> >> > http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/mapreduce/Export.html
> >> >
> >> > Has anyone used these tools?  Is there a better way?
> >> >
> >> > Thanks,
> >> > James
> >> >
> >> >
> >> >
> >>
> >
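
For reference, the distcp approach suggested in the thread can be sketched as follows. This is only an illustration: the NameNode addresses staging-nn:8020 and prod-nn:8020 and the helper name build_distcp_cmd are hypothetical and do not come from the thread. It assumes HBase is stopped on both clusters and that /hbase does not yet exist on the destination. The script prints the command for review rather than running it.

```shell
#!/bin/sh
# Sketch: build the distcp command that copies the HBase root directory
# from one cluster's HDFS to another's.
# Assumptions (not from the thread): the NameNode host:port values below
# are placeholders, and HBase is shut down on both clusters.

build_distcp_cmd() {
    src_nn="$1"            # source NameNode, e.g. staging-nn:8020 (hypothetical)
    dst_nn="$2"            # destination NameNode, e.g. prod-nn:8020 (hypothetical)
    root="${3:-/hbase}"    # HBase root directory in HDFS, /hbase by default
    printf 'hadoop distcp hdfs://%s%s hdfs://%s%s\n' \
        "$src_nn" "$root" "$dst_nn" "$root"
}

# Print the command so it can be checked before being run by hand.
build_distcp_cmd "staging-nn:8020" "prod-nn:8020"
```

Passing a third argument, such as /hbase.large, builds the same command for a renamed dataset directory like the ones Dan describes.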

