Subject: Re: migrate cluster to different datacenter
From: Nitin Kesarwani
To: common-user@hadoop.apache.org
Date: Sat, 4 Aug 2012 18:02:51 +0530

Given the size of the data, there are several possible approaches here:

1. Moving the boxes

Not possible, as I suppose the data is needed for 24x7 analytics.

2. Mirroring the data

This is a good solution. However, if data is being written/removed
continuously (i.e. it is part of a live system), there is a chance of losing
some of it while the mirroring happens, unless

a) you block writes/updates during that time (which would be as good as
unplugging the machines and trucking them over), or,
b) you keep track of what was modified after the mirroring process started.

I would recommend 2b) because it minimizes downtime. Here is how I think you
can do it, using some of the tools provided by Hadoop itself:

a) Use a fast distributed copying tool to copy large chunks of data. Before
you kick this off, set up a utility that tracks modifications made to the
live system while the copy runs in the background, and logs them to an audit
trail.

b) Once the bulk copy is done, let the new cluster catch up by replaying the
modifications recorded in your utility's log. Once the two are in sync, begin
the (minimal) downtime by switching off the JobTracker on the live cluster so
that no new files are created.

c) As soon as the last chunk has been copied, change the DNS entries so that
the hostnames referenced by your Hadoop jobs point to the new location.

d) Turn on the JobTracker on the new cluster.

e) Enjoy a drink with the money you saved by not buying a third-party
migration tool, and pat yourself on the back! ;)

The key to this approach is making the bulk copy in step a) as fast as
possible: the less time it takes, the smaller the audit trail and the shorter
the overall downtime. You can build something in-house for the copy, or use
DistCp, which ships with Hadoop and copies the data using MapReduce.
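For step a), a rough sketch of what the DistCp run could look like, assuming
you drive it from the new cluster (hostnames, ports, and paths below are
placeholders; if the two clusters end up on different Hadoop versions, read
from the source over hftp:// as shown instead of hdfs://):

  hadoop distcp -update -m 100 \
      hftp://old-namenode:50070/user/data \
      hdfs://new-namenode:8020/user/data

-m controls how many map tasks (parallel copiers) are used, and -update skips
files that already exist on the destination with the same size, so the same
command can be re-run for the catch-up passes.

For the audit trail, one low-tech option -- assuming you have HDFS audit
logging enabled on the NameNode (the log location and exact line format
depend on your log4j setup, and this assumes no spaces in paths) -- is to
pull the paths touched by write operations out of the audit log, e.g.:

  grep -E 'cmd=(create|delete|rename|mkdirs)' hdfs-audit.log \
      | awk -F'src=' '{print $2}' | awk '{print $1}' | sort -u > changed_paths.txt

That list (or the parent directories it rolls up to) is what you feed into
the final distcp -update pass during the downtime window.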
On Sat, Aug 4, 2012 at 3:27 AM, Michael Segel wrote:

> Sorry at 1PB of disk... compression isn't going to really help a whole
> heck of a lot. Your networking bandwidth will be your bottleneck.
>
> So lets look at the problem.
>
> How much down time can you afford?
> What does your hardware look like?
> How much space do you have in your current data center?
>
> You have 1PB of data. OK, what does the access pattern look like?
>
> There are a couple of ways to slice and dice this. How many trucks do you
> have?
>
> On Aug 3, 2012, at 4:24 PM, Harit Himanshu wrote:
>
> > Moving 1 PB of data would take loads of time,
> > - check if this new data center provides something similar to
> > http://aws.amazon.com/importexport/
> > - Consider multi part uploading of data
> > - consider compressing the data
> >
> > On Aug 3, 2012, at 2:19 PM, Patai Sangbutsarakum wrote:
> >
> >> thanks for response.
> >> Physical move is not a choice in this case. Purely looking for copying
> >> data and how to catch up with the update of a file while it is being
> >> migrated.
> >>
> >> On Fri, Aug 3, 2012 at 12:40 PM, Chen He wrote:
> >>> sometimes, physically moving hard drives helps. :)
> >>> On Aug 3, 2012 1:50 PM, "Patai Sangbutsarakum" <silvianhadoop@gmail.com>
> >>> wrote:
> >>>
> >>>> Hi Hadoopers,
> >>>>
> >>>> We have a plan to migrate Hadoop cluster to a different datacenter
> >>>> where we can triple the size of the cluster.
> >>>> Currently, our 0.20.2 cluster have around 1PB of data. We use only
> >>>> Java/Pig.
> >>>>
> >>>> I would like to get some input how we gonna handle with transferring
> >>>> 1PB of data to a new site, and also keep up with
> >>>> new files that thrown into cluster all the time.
> >>>>
> >>>> Happy friday !!
> >>>>
> >>>> P