hadoop-hdfs-user mailing list archives

From Sunil Govind <sunil.gov...@gmail.com>
Subject Re: How to distcp data between two clusters which are not in the same local network?
Date Mon, 15 Aug 2016 14:06:52 GMT
Hi

I think you can also refer to the link below.
http://aajisaka.github.io/hadoop-project/hadoop-distcp/DistCp.html

Thanks
Sunil

On Mon, Aug 15, 2016 at 7:26 PM Wei-Chiu Chuang <weichiu@apache.org> wrote:

> Hello,
> if I understand your question correctly, you are actually building a
> multihomed Hadoop cluster, correct?
> A multihomed Hadoop cluster can be tricky to set up, to the extent that
> Cloudera does not recommend it. I've not set up a multihomed Hadoop cluster
> before, but I think you have to make sure reverse DNS resolution works for
> the IP addresses.
>
>
> https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html
>
>
> On Mon, Aug 15, 2016 at 1:06 AM, Shady Xu <shadyxu@gmail.com> wrote:
>
>> Hi all,
>>
>> Recently I tried to use distcp to copy data between two clusters that are
>> not on the same local network. Fortunately, each node of the source cluster
>> has an extra interface and IP address that can be reached from the
>> destination cluster. But during the distcp run, the map tasks always
>> used the local IPs of the source cluster nodes, which they cannot reach.
>>
>> I tried setting the property 'dfs.datanode.dns.interface' to the interface
>> I want, and I also tried setting the property
>> 'dfs.datanode.use.datanode.hostname' to true. Neither worked.
>>
>> Does Hadoop support this now, or am I missing something?
>>
>
>
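
For reference, the multihoming guide linked above also describes a
client-side property, dfs.client.use.datanode.hostname, which is separate
from the dfs.datanode.use.datanode.hostname setting tried in the original
post. Since the distcp map tasks act as HDFS clients, one thing that might be
worth trying is passing that property to the job. The thread does not confirm
that this resolves the problem, and it can only help if the DataNode hostnames
resolve to the reachable addresses from the destination cluster. Below is a
minimal sketch in Python that only assembles the command line; the NameNode
hosts, port and paths are placeholders, and build_distcp_command is a
hypothetical helper, not part of Hadoop.

```python
import shlex


def build_distcp_command(src_uri, dst_uri, client_overrides):
    """Return the argv list for a `hadoop distcp` run with -D overrides.

    Hypothetical helper for illustration; it does not run anything.
    """
    argv = ["hadoop", "distcp"]
    # Generic -D options go before the source and destination paths.
    for key, value in sorted(client_overrides.items()):
        argv.append("-D{}={}".format(key, value))
    argv.extend([src_uri, dst_uri])
    return argv


# Assumption: asking HDFS clients (here, the distcp map tasks) to connect to
# DataNodes by hostname rather than by the IP reported by the NameNode.
overrides = {"dfs.client.use.datanode.hostname": "true"}

# Placeholder URIs; substitute the real NameNode addresses and paths.
argv = build_distcp_command(
    "hdfs://src-nn.example.com:8020/data",
    "hdfs://dst-nn.example.com:8020/data",
    overrides,
)

# Print a shell-safe command line for review before running it by hand.
print(" ".join(shlex.quote(arg) for arg in argv))
```

The printed command can be inspected and then run from a node of the
destination cluster. Building it as a list keeps each argument separate, so
the same helper could be handed to subprocess.run without shell quoting
issues.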
