hadoop-common-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: migrate cluster to different datacenter
Date Fri, 03 Aug 2012 21:57:30 GMT
Sorry, but at 1PB of disk, compression isn't really going to help a whole heck of a lot. Your
network bandwidth will be your bottleneck.
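
To put rough numbers on the bandwidth point, here is a back-of-the-envelope sketch. The link speeds and the 70% sustained utilisation figure are assumptions for illustration, not numbers from this thread, and `transfer_days` is just a helper name made up for the example.

```python
def transfer_days(data_bytes, link_bits_per_sec, efficiency=0.7):
    """Days needed to push data_bytes over a link.

    efficiency is an assumed fraction of the nominal link speed that can
    actually be sustained (protocol overhead, contention with other traffic).
    """
    effective_bps = link_bits_per_sec * efficiency
    seconds = data_bytes * 8 / effective_bps
    return seconds / 86400


PB = 10**15  # decimal petabyte, in bytes

# Hypothetical inter-datacenter link speeds, in Gbit/s.
for gbps in (1, 10, 40):
    days = transfer_days(PB, gbps * 10**9)
    print(f"{gbps:>2} Gbit/s link: about {days:.1f} days for 1PB")
```

Even if compression halved the data, the transfer on a slow link would still be measured in weeks or months, which is why the questions below matter more than the compression ratio.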

So let's look at the problem.

How much downtime can you afford?
What does your hardware look like? 
How much space do you have in your current data center? 

You have 1PB of data. OK, what does the access pattern look like? 

There are a couple of ways to slice and dice this. How many trucks do you have? 

On Aug 3, 2012, at 4:24 PM, Harit Himanshu <harit.subscriptions@gmail.com> wrote:

> Moving 1 PB of data would take loads of time. Some things to consider:
> - Check whether the new data center provides something similar to http://aws.amazon.com/importexport/
> - Consider multipart uploading of the data
> - Consider compressing the data
> On Aug 3, 2012, at 2:19 PM, Patai Sangbutsarakum wrote:
>> Thanks for the response.
>> A physical move is not an option in this case. We are purely looking at
>> copying the data, and at how to catch up with updates to a file while it
>> is being migrated.
>> On Fri, Aug 3, 2012 at 12:40 PM, Chen He <airbots@gmail.com> wrote:
>>> Sometimes physically moving hard drives helps. :)
>>> On Aug 3, 2012 1:50 PM, "Patai Sangbutsarakum" <silvianhadoop@gmail.com>
>>> wrote:
>>>> Hi Hadoopers,
>>>> We have a plan to migrate our Hadoop cluster to a different datacenter
>>>> where we can triple the size of the cluster.
>>>> Currently, our 0.20.2 cluster has around 1PB of data. We use only
>>>> Java/Pig.
>>>> I would like to get some input on how to handle transferring
>>>> 1PB of data to a new site while also keeping up with the
>>>> new files that are thrown into the cluster all the time.
>>>> Happy Friday!!
>>>> P
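
On the catch-up question raised in the original post, one common approach is a bulk copy followed by repeated incremental passes that copy only what is new or changed, similar in spirit to the `-update` option of `hadoop distcp`. The sketch below only illustrates the comparison step. The `FileStat` type, the `plan_catch_up` helper and the listing format are hypothetical, not part of any Hadoop API, and it assumes modification times are comparable between the two clusters.

```python
from typing import Dict, List, NamedTuple


class FileStat(NamedTuple):
    size: int   # file length in bytes
    mtime: int  # modification time, seconds since the epoch


def plan_catch_up(source: Dict[str, FileStat],
                  dest: Dict[str, FileStat]) -> List[str]:
    """Return the source paths that still need copying to the destination.

    A file is (re)copied when it is missing at the destination, when its
    size differs, or when the source copy was modified more recently.
    """
    to_copy = []
    for path, src in source.items():
        dst = dest.get(path)
        if dst is None or dst.size != src.size or dst.mtime < src.mtime:
            to_copy.append(path)
    return sorted(to_copy)


# Hypothetical listings taken after the first bulk copy has finished.
source = {
    "/logs/a": FileStat(10, 100),
    "/logs/b": FileStat(20, 200),  # grew after it was copied
    "/logs/c": FileStat(5, 50),    # arrived after the bulk copy started
}
dest = {
    "/logs/a": FileStat(10, 100),
    "/logs/b": FileStat(15, 150),
}
print(plan_catch_up(source, dest))
```

Each pass should be smaller than the last, so the final pass can run in a short window with writes to the old cluster paused.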
