hadoop-common-user mailing list archives

From Taeho Kang <tka...@gmail.com>
Subject Re: Transferring data between different Hadoop clusters
Date Tue, 03 Feb 2009 01:33:13 GMT
Thanks for your prompt reply.

When using the command
"./bin/hadoop distcp hftp://cluster1:50070/path hdfs://cluster2/path"

- Should this command be run on cluster1?
- What does port "50070" specify? Is it the one in "fs.default.name", or
the one in "dfs.http.address"?

/Taeho



On Mon, Feb 2, 2009 at 12:40 PM, Mark Chadwick <mchadwick@invitemedia.com> wrote:

> Taeho,
>
> The distcp command is perfect for this.  If you're copying between two
> clusters running the same version of Hadoop, you can do something like:
>
> ./bin/hadoop distcp hdfs://cluster1/path hdfs://cluster2/path
>
> If you're copying between 0.18 and 0.19, the command will look like:
>
> ./bin/hadoop distcp hftp://cluster1:50070/path hdfs://cluster2/path
>
> Hope that helps,
> -Mark
>
> On Sun, Feb 1, 2009 at 9:48 PM, Taeho Kang <tkang1@gmail.com> wrote:
>
> > Dear all,
> >
> > There have been times where I needed to transfer some big data from one
> > version of Hadoop cluster to another.
> > (e.g. from hadoop 0.18 to hadoop 0.19 cluster)
> >
> > Other than copying files from one cluster to a local file system and
> > uploading them to another, is there a tool that does this?
> >
> > Thanks in advance,
> > Regards,
> >
> > /Taeho
> >
>
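[Editor's note: the cross-version copy above can be sketched end-to-end as follows. This is a hypothetical example, not from the thread: hostnames and paths are illustrative, and "-i" and "-update" are standard distcp options of that era. Because HFTP is read-only, the job should be run on the destination cluster, so that the destination's own Hadoop version performs the writes.]

```shell
# Run on a node of the destination cluster (cluster2).
# -i      : ignore failures on individual files instead of aborting
# -update : skip files that already exist at the destination with the same size
./bin/hadoop distcp -i -update \
    hftp://cluster1-namenode:50070/user/data \
    hdfs://cluster2-namenode/user/data
```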
