hbase-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Copying data from one Hbase cluster to Another Hbase cluster
Date Sat, 15 Feb 2014 10:20:01 GMT
Note that a long-running MR service is not a requirement, and that MR
can be used just as a speedy facilitator. Nothing will go wrong if
you shut down your MR services right after your parallel copy (via
distcp, etc.) has completed.
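
To make the distcp route quoted below concrete, here is a minimal sketch that only builds the command lines and does not run them. It assumes one distcp invocation per top-level entry under /hbase, skipping -ROOT- and .META. as the quoted advice recommends. The helper name, NameNode hosts, ports, and table names are all hypothetical.

```python
# Hypothetical helper: the function name, hosts, ports, and table names are
# illustrative and do not come from the thread.
SKIP = frozenset({"-ROOT-", ".META."})  # catalog directories the thread says not to copy


def build_distcp_commands(src_root, dst_root, entries, skip=SKIP):
    """Return one `hadoop distcp <src> <dst>` argv list per entry not in `skip`.

    Assumes each target path does not exist yet on the destination cluster.
    """
    src_root = src_root.rstrip("/")
    dst_root = dst_root.rstrip("/")
    commands = []
    for name in sorted(entries):
        if name in skip:
            continue  # leave the catalog directories behind
        commands.append(
            ["hadoop", "distcp", f"{src_root}/{name}", f"{dst_root}/{name}"]
        )
    return commands


if __name__ == "__main__":
    # Stand-in for the output of listing /hbase on the source cluster.
    listing = ["-ROOT-", ".META.", "users", "events"]
    for argv in build_distcp_commands(
        "hdfs://old-nn:8020/hbase", "hdfs://new-nn:8020/hbase", listing
    ):
        print(" ".join(argv))
```

The skip set is a parameter so that other entries under /hbase can be excluded as well if your layout calls for it.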

On Sat, Feb 15, 2014 at 9:39 AM, divye sheth <divs.sheth@gmail.com> wrote:
> You could try the hadoop distcp command to transfer the hbase directory
> from one cluster to the other. This does not require you to set up MapReduce;
> it will start a MapReduce job in local mode, i.e. with a single mapper. When
> copying from one cluster to another, remember not to copy the -ROOT- and
> .META. directories. I have used this method without any data loss. After the
> copy is complete, start your new HBase; it should be able to read the contents
> and build region information from the new directory.
>
> Thanks
> D
> On Feb 14, 2014 5:45 PM, "Samir Ahmic" <ahmic.samir@gmail.com> wrote:
>
>> Well, that depends on the size of your dataset. You can use
>> hadoop fs -copyToLocal to copy the /hbase directory to local disk or some
>> other storage device that is mounted on your original cluster. Then you can
>> copy the /hbase dir to the second cluster with hadoop fs -copyFromLocal.
>> Of course, this will require that the source and destination HBase clusters
>> are offline. I have never used this approach, but it should work.
>>
>> Regards
>>
>>
>>
>>
>> On Fri, Feb 14, 2014 at 11:15 AM, Vimal Jain <vkjk89@gmail.com> wrote:
>>
>> > Hi Samir,
>> > As far as I know, all these techniques require MapReduce daemons to be up
>> > on the source and destination clusters.
>> > Is there any other solution which does not require MapReduce at all?
>> >
>> >
>> > On Fri, Feb 14, 2014 at 2:41 PM, Samir Ahmic <ahmic.samir@gmail.com>
>> > wrote:
>> >
>> > > Hi Vimal,
>> > >
>> > > I have a few options for moving data from one HBase cluster to another:
>> > >
>> > >    1. You can use the org.apache.hadoop.hbase.mapreduce.Export tool to
>> > >    export tables to HDFS, and then use hadoop distcp to move the data to
>> > >    another cluster. Once the data is on the second cluster, you can use
>> > >    the org.apache.hadoop.hbase.mapreduce.Import tool to import the
>> > >    tables. Please look at http://hbase.apache.org/book.html#export.
>> > >    2. The second option is to use the CopyTable tool; please look at:
>> > >    http://hbase.apache.org/book.html#copytable
>> > >    3. The third option is to enable HBase snapshots, create table
>> > >    snapshots, and then use the ExportSnapshot tool to move them to the
>> > >    second cluster. Once the snapshots are on the second cluster, you can
>> > >    clone tables from them. Please look at:
>> > >    http://hbase.apache.org/book.html#ops.snapshots
>> > >
>> > > I have used options 1 and 3 for moving data between clusters, and in my
>> > > case 3 was the better solution.
>> > >
>> > > Regards
>> > > Samir
>> > >
>> > >
>> > >
>> > > On Fri, Feb 14, 2014 at 8:33 AM, Vimal Jain <vkjk89@gmail.com> wrote:
>> > >
>> > > > Hi,
>> > > > I have HBase and Hadoop set up in pseudo-distributed mode in production.
>> > > > Now I am planning to move from pseudo-distributed mode to fully
>> > > > distributed mode (a 2-node cluster).
>> > > > My existing Hadoop and HBase versions are 1.1.2 and 0.94.7.
>> > > > I am planning to run the fully distributed setup with HBase version
>> > > > 0.94.16 and Hadoop version 1.x or 2.x (not yet decided).
>> > > >
>> > > > What are the different ways to copy data from the existing setup
>> > > > (pseudo-distributed mode) to this new setup (2-node fully distributed
>> > > > mode)?
>> > > >
>> > > > Please help.
>> > > >
>> > > > --
>> > > > Thanks and Regards,
>> > > > Vimal Jain
>> > > >
>> > >
>> >
>> >
>> >
>> > --
>> > Thanks and Regards,
>> > Vimal Jain
>> >
>>



-- 
Harsh J
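
For reference, the three options listed in the quoted thread (Export/distcp/Import, CopyTable, and snapshots) can be sketched as ordered command plans. This is only an illustration that builds command strings and does not run anything. The tool class names and HBase shell commands come from the HBase reference guide; the function names, table and snapshot names, paths, ZooKeeper address, and mapper count are hypothetical placeholders.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    cluster: str  # "source" or "destination" (where the step is run)
    tool: str     # "cli" for a command line, "hbase shell" for a shell statement
    command: str


def export_import_plan(table: str, src_dir: str, dst_dir: str) -> List[Step]:
    """Option 1: Export to HDFS, distcp the files, then Import.

    Assumes the table already exists on the destination before Import runs.
    """
    return [
        Step("source", "cli",
             f"hbase org.apache.hadoop.hbase.mapreduce.Export {table} {src_dir}"),
        Step("source", "cli", f"hadoop distcp {src_dir} {dst_dir}"),
        Step("destination", "cli",
             f"hbase org.apache.hadoop.hbase.mapreduce.Import {table} {dst_dir}"),
    ]


def copytable_plan(table: str, peer_zk: str) -> List[Step]:
    """Option 2: CopyTable; peer_zk has the form quorum:port:znode-parent."""
    return [
        Step("source", "cli",
             "hbase org.apache.hadoop.hbase.mapreduce.CopyTable "
             f"--peer.adr={peer_zk} {table}"),
    ]


def snapshot_plan(table: str, snapshot: str, dst_root: str,
                  mappers: int = 4) -> List[Step]:
    """Option 3: take a snapshot, export it, clone it on the destination."""
    return [
        Step("source", "hbase shell", f"snapshot '{table}', '{snapshot}'"),
        Step("source", "cli",
             "hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot "
             f"-snapshot {snapshot} -copy-to {dst_root} -mappers {mappers}"),
        Step("destination", "hbase shell",
             f"clone_snapshot '{snapshot}', '{table}'"),
    ]


if __name__ == "__main__":
    # Placeholder names; substitute your own table, snapshot, and NameNode.
    for step in snapshot_plan("users", "users_snap", "hdfs://new-nn:8020/hbase"):
        print(f"[{step.cluster} / {step.tool}] {step.command}")
```

All three plans launch MapReduce jobs, which is why the thread turns to plain file copies when MR is not available.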
