hbase-user mailing list archives

From Ryan Rawson <ryano...@gmail.com>
Subject Re: HBase export/import
Date Wed, 10 Feb 2010 03:07:48 GMT
You can also use distcp instead of the copyToLocal && scp &&
copyFromLocal chain you have there.
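
As a rough sketch of what I mean, assuming HBase is stopped on the source
cluster first: the NameNode addresses below (staging-nn and prod-nn on port
8020) are made-up placeholders, not anything from this thread, so substitute
your own. The script only prints the command unless you set RUN=1.

```shell
#!/bin/sh
# Hypothetical NameNode URIs (assumptions); replace with your clusters' values.
SRC_NN="hdfs://staging-nn:8020"
DST_NN="hdfs://prod-nn:8020"

# distcp copies directly between the two HDFS instances as a MapReduce job,
# so nothing is staged on local disk. Stop HBase on the source before copying.
DISTCP_CMD="hadoop/bin/hadoop distcp ${SRC_NN}/hbase ${DST_NN}/hbase"

# Dry run by default: print the command. Set RUN=1 to actually execute it.
if [ "${RUN:-0}" = "1" ]; then
    $DISTCP_CMD
else
    echo "$DISTCP_CMD"
fi
```

Since the copy runs in parallel across the cluster, it should avoid the
local-disk bottleneck of the copyToLocal/scp approach.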

On Tue, Feb 9, 2010 at 7:02 PM, Dan Washusen <dan@reactive.org> wrote:
> +1
>
> I use this method when performance testing on different data sets. I have
> several datasets to test on (varying sizes, etc.). When I want to switch
> datasets I just shut down hbase and rename the /hbase directory...
>
> e.g. (assuming hbase is not running)
> hadoop/bin/hadoop fs -mv /hbase /hbase.small
> hadoop/bin/hadoop fs -mv /hbase.large /hbase
>
> When I want to move my data between clusters I use:
> hadoop/bin/hadoop fs -copyToLocal /hbase.large /tmp/hbase.large
> scp -r /tmp/hbase.large user@host:/tmp
> ssh user@host
> hadoop/bin/hadoop fs -put /tmp/hbase.large /hbase
>
>
> Very handy :)
>
>
> On 10 February 2010 13:50, Ryan Rawson <ryanobjc@gmail.com> wrote:
>
>> If you stop the source cluster then you can distcp the /hbase to the
>> other cluster. Done. A perfect copy.
>>
>> That is probably the most efficient/highest performing way.
>>
>> On Tue, Feb 9, 2010 at 6:47 PM, James Baldassari <james@dataxu.com> wrote:
>> > Hi,
>> >
>> > I'm wondering if it's possible to export all data from one HBase cluster
>> > and import it into another.  We have a lot of data that we've imported
>> > into our staging HBase environment, and rather than repeating the
>> > lengthy import process in our production environment we would prefer to
>> > just copy all the data directly from HBase/HDFS in staging into
>> > production.  Is there an easy way to do this?  I know Hadoop has some
>> > distributed copy functionality, but I don't know if this will work with
>> > HBase.  The number of region servers and the replication factor will be
>> > the same in the source and destination environments, but the
>> > hostnames/IPs will be different.  The production environment is
>> > completely empty right now, so we don't need to worry about overwriting
>> > data.
>> >
>> > I came across these links while searching for information on HBase
>> > export/import:
>> >
>> > http://issues.apache.org/jira/browse/HBASE-897
>> > http://issues.apache.org/jira/browse/HBASE-1684
>> > http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/mapreduce/Export.html
>> >
>> > Has anyone used these tools?  Is there a better way?
>> >
>> > Thanks,
>> > James
>> >
>> >
>> >
>>
>
