hbase-dev mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: A better way to migrate the whole cluster?
Date Fri, 15 Aug 2014 17:52:03 GMT
Try adding these options:


-Dhbase.client.scanner.caching=100
-Dmapred.map.tasks.speculative.execution=false
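
As a rough sketch of where these would go: CopyTable parses Hadoop generic options, so the `-D` settings are placed right after the class name and before `--peer.adr`. The peer address and table name below are just the placeholders used elsewhere in this thread, and the script only prints the command (a dry run) rather than launching the job.

```shell
# Dry-run sketch: assemble the CopyTable command with the two suggested options.
# PEER and TABLE are placeholders taken from the thread, not real endpoints.
PEER="hbase://cluster_name"
TABLE="table_name"

CMD="./bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable"
# Fetch 100 rows per scanner RPC.
CMD="$CMD -Dhbase.client.scanner.caching=100"
# Disable speculative map tasks so the same rows are not copied twice.
CMD="$CMD -Dmapred.map.tasks.speculative.execution=false"
CMD="$CMD --peer.adr=$PEER $TABLE"

# Print the command for review; run it by hand once it looks right.
echo "$CMD"
```

The generic `-D` options need to come before the CopyTable-specific arguments, otherwise they are likely to be treated as ordinary arguments and not applied to the job configuration.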


Also, as others pointed out, what's the bandwidth between the two clusters?



________________________________
 From: tobe <tobeg3oogle@gmail.com>
To: dev@hbase.apache.org; lars hofhansl <larsh@apache.org> 
Sent: Thursday, August 14, 2014 11:24 PM
Subject: Re: A better way to migrate the whole cluster?
 

Thanks, Lars.

We're using HBase 0.94.11 and followed the instructions to run `./bin/hbase
org.apache.hadoop.hbase.mapreduce.CopyTable --peer.adr=hbase://cluster_name
table_name`. We have a namespace service that resolves "hbase://cluster_name"
to the ZooKeeper quorum. The job ran on a shared YARN cluster.

Many factors can affect the performance, and we haven't yet found the cause
of the slowdown. We would appreciate any suggestions.



On Fri, Aug 15, 2014 at 1:34 PM, lars hofhansl <larsh@apache.org> wrote:

> What version of HBase? How are you running CopyTable? A day for 1.8T is
> not what we would expect.
> You can definitely take a snapshot and then export the snapshot to another
> cluster, which will move the actual files; but CopyTable should not be so
> slow.
>
>
> -- Lars
>
>
>
> ________________________________
>  From: tobe <tobeg3oogle@gmail.com>
> To: "user@hbase.apache.org" <user@hbase.apache.org>
> Cc: dev@hbase.apache.org
> Sent: Thursday, August 14, 2014 8:18 PM
> Subject: A better way to migrate the whole cluster?
>
>
> Sometimes our users want to upgrade their servers or move to a new
> datacenter, and then we have to migrate their HBase data. Currently we
> enable replication from the old cluster to the new cluster, and run
> CopyTable to move the older data.
>
> This is rather inefficient. It takes more than a day to migrate 1.8T of
> data, and more time to verify it. Is there a better way to do this, such as
> using snapshots or copying the HDFS files directly?
>
> And what is the best practice, in your experience?
>