hadoop-common-user mailing list archives

From Austin Chungath <austi...@gmail.com>
Subject Re: Best practice to migrate HDFS from 0.20.205 to CDH3u3
Date Thu, 03 May 2012 10:25:00 GMT
Yes, this was first posted on the Cloudera mailing list. There were no
responses.

But this is not related to Cloudera as such.

CDH3 is based on Apache Hadoop 0.20. My data is in Apache Hadoop
0.20.205.

There is a namenode upgrade option when migrating to a higher version,
say from 0.20 to 0.20.205, but here I am downgrading from 0.20.205 to
0.20 (CDH3).
Is this possible?
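
A sketch of the upgrade path referred to above, on a stock 0.20-era
release (as far as I know there is no corresponding downgrade option in
these versions):

    # start HDFS with the namenode in upgrade mode; the namenode converts
    # the existing storage layout and keeps a rollback copy
    bin/start-dfs.sh -upgrade

    # once the upgraded cluster checks out, make the new layout permanent
    bin/hadoop dfsadmin -finalizeUpgrade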


On Thu, May 3, 2012 at 3:25 PM, Prashant Kommireddi
<prash1784@gmail.com> wrote:

> Seems like a matter of upgrade. I am not a Cloudera user so would not know
> much, but you might find some help by moving this to the Cloudera mailing list.
>
> On Thu, May 3, 2012 at 2:51 AM, Austin Chungath <austincv@gmail.com>
> wrote:
>
> > There is only one cluster. I am not copying between clusters.
> >
> > Say I have a cluster running Apache 0.20.205 with 10 TB of storage
> > capacity, holding about 8 TB of data.
> > Now how can I migrate the same cluster to CDH3 and keep that same 8 TB
> > of data?
> >
> > I can't copy the 8 TB of data using distcp because I have only 2 TB of
> > free space.
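
As a sanity check, capacity and free-space figures like those above can
be read from the standard dfsadmin report:

    # prints configured capacity, DFS used, and DFS remaining
    hadoop dfsadmin -report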
> >
> >
> > On Thu, May 3, 2012 at 3:12 PM, Nitin Pawar <nitinpawar432@gmail.com>
> > wrote:
> >
> > > You can actually look at distcp:
> > >
> > > http://hadoop.apache.org/common/docs/r0.20.0/distcp.html
> > >
> > > but this means that you need two different clusters available to do
> > > the migration.
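
The distcp invocation being suggested would look roughly like this; the
hostnames, ports, and paths are placeholders:

    # run when both clusters speak the same HDFS protocol version
    hadoop distcp hdfs://old-nn:8020/user/data hdfs://new-nn:8020/user/data

For copying between different HDFS versions, the distcp guide linked
above describes running the copy on the destination cluster and reading
from the source over the read-only hftp interface (the namenode's HTTP
port, 50070 by default):

    hadoop distcp hftp://old-nn:50070/user/data hdfs://new-nn:8020/user/data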
> > >
> > > On Thu, May 3, 2012 at 12:51 PM, Austin Chungath <austincv@gmail.com>
> > > wrote:
> > >
> > > > Thanks for the suggestions.
> > > > My concern is that I can't actually copyToLocal from the DFS because
> > > > the data is huge.
> > > >
> > > > If my Hadoop were 0.20 and I were upgrading to 0.20.205, I could do a
> > > > namenode upgrade. I wouldn't have to copy data out of the DFS.
> > > >
> > > > But here I have Apache Hadoop 0.20.205 and I want to use CDH3, which
> > > > is based on 0.20.
> > > > Now it is actually a downgrade, as 0.20.205's namenode info has to be
> > > > used by 0.20's namenode.
> > > >
> > > > Any idea how I can achieve what I am trying to do?
> > > >
> > > > Thanks.
> > > >
> > > > On Thu, May 3, 2012 at 12:23 PM, Nitin Pawar
> > > > <nitinpawar432@gmail.com> wrote:
> > > >
> > > > > I can think of the following options:
> > > > >
> > > > > 1) write simple get-and-put code which gets the data from one DFS
> > > > > and loads it into the other
> > > > > 2) see if distcp between the two versions is compatible
> > > > > 3) this is what I had done (and my data was hardly a few hundred
> > > > > GB): a dfs -copyToLocal, and then on the new grid a copyFromLocal
> > > > > (see the sketch below)
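
A sketch of option 3 with placeholder paths; note that the local staging
area has to hold the full dataset, which is exactly what makes this
impractical at 8 TB:

    # on the old cluster: copy the data out of HDFS onto local disk
    hadoop fs -copyToLocal /user/data /mnt/staging/data

    # on the new cluster: load the staged data back into HDFS
    hadoop fs -copyFromLocal /mnt/staging/data /user/data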
> > > > >
> > > > > On Thu, May 3, 2012 at 11:41 AM, Austin Chungath
> > > > > <austincv@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > > I am migrating from Apache Hadoop 0.20.205 to CDH3u3.
> > > > > > I don't want to lose the data that is in the HDFS of Apache
> > > > > > Hadoop 0.20.205.
> > > > > > How do I migrate to CDH3u3 but keep the data that I have on
> > > > > > 0.20.205?
> > > > > > What are the best practices / techniques to do this?
> > > > > >
> > > > > > Thanks & Regards,
> > > > > > Austin
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Nitin Pawar
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Nitin Pawar
> > >
> >
>
