hadoop-hdfs-user mailing list archives

From Shady Xu <shad...@gmail.com>
Subject Re: How to distcp data between two clusters which are not in the same local network?
Date Wed, 24 Aug 2016 09:17:33 GMT
Does anyone have any idea?

2016-08-16 10:27 GMT+08:00 Shady Xu <shadyxu@gmail.com>:

> Thanks Wei-Chiu and Sunil, I had read the docs you mentioned before
> starting. The specific problem now is that the DataNodes of the source
> cluster report their local IP, which cannot be accessed from the
> NodeManagers of the destination cluster, instead of the public one. It
> seems the solution is to set the `dfs.datanode.dns.interface` property,
> but unfortunately that doesn't work.
> 2016-08-15 22:06 GMT+08:00 Sunil Govind <sunil.govind@gmail.com>:
>> Hi
>> I think you can also refer to the link below.
>> http://aajisaka.github.io/hadoop-project/hadoop-distcp/DistCp.html
>> Thanks
>> Sunil
>> On Mon, Aug 15, 2016 at 7:26 PM Wei-Chiu Chuang <weichiu@apache.org>
>> wrote:
>>> Hello,
>>> if I understand your question correctly, you are actually building a
>>> multi-homed Hadoop cluster, correct?
>>> A multi-homed Hadoop cluster can be tricky to set up, to the extent that
>>> Cloudera does not recommend it. I've not set up a multi-homed Hadoop
>>> cluster before, but I think you have to make sure reverse resolution
>>> works for the IP addresses.
>>> https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html
>>> On Mon, Aug 15, 2016 at 1:06 AM, Shady Xu <shadyxu@gmail.com> wrote:
>>>> Hi all,
>>>> Recently I tried to use distcp to copy data across two clusters which
>>>> are not in the same local network. Fortunately, the nodes of the source
>>>> cluster each have an extra interface and IP which can be accessed from
>>>> the destination cluster. But during the distcp run, the map tasks always
>>>> used the local IP of the source cluster nodes, which they cannot reach.
>>>> I tried changing the property 'dfs.datanode.dns.interface' to the
>>>> interface I want, and I also tried setting the property
>>>> 'dfs.datanode.use.datanode.hostname' to true. Neither worked.
>>>> Does Hadoop support this now, or am I missing something?
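
For reference, the HDFS multihoming guide linked above also describes a client-side setting, `dfs.client.use.datanode.hostname`, alongside the DataNode-side one tried here. A minimal sketch of how the two might look in `hdfs-site.xml` follows. It assumes the DataNode hostnames resolve to the externally reachable addresses when looked up from the destination cluster (for example via DNS or `/etc/hosts` entries there); that is an assumption about this setup, and the fragment is untested against it, so treat it as a starting point rather than a confirmed fix.

```xml
<!-- hdfs-site.xml sketch. Assumption: DataNode hostnames resolve to the
     externally reachable IPs when looked up from the destination cluster. -->
<property>
  <!-- Clients (here, the distcp map tasks) connect to DataNodes by hostname
       instead of by the IP address returned by the NameNode. -->
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
<property>
  <!-- DataNodes connect to other DataNodes by hostname for data transfer. -->
  <name>dfs.datanode.use.datanode.hostname</name>
  <value>true</value>
</property>
```

The client-side property has to be visible to the job that reads from the source cluster, so it would belong in the configuration used by the distcp map tasks on the destination side.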
