hadoop-mapreduce-user mailing list archives

From Mauro Cohen <mauroco...@gmail.com>
Subject Problem With NAT ips
Date Wed, 10 Apr 2013 21:32:13 GMT
Hello, I have a problem with the new version of Hadoop.

I have a cluster with 2 nodes.
Each one has a private IP and a public IP configured through NAT.
The problem is that the private IPs of the nodes do not belong to the same
network (I have no connectivity between the nodes through those IPs).
I have connectivity between the nodes only through the NAT IPs (ssh, ping, etc.).

With Hadoop 0.20.x, when I configured the datanode and namenode
configuration files I always used the host-name for the properties (e.g. the
fs.default.name property) and never had problems with this.
But with the new version of Hadoop, the way the nodes communicate with each
other seems to have changed, and at some point they use the private IPs
instead of the host-names.

I have installed a cluster with 2 nodes:

hadoop-2-00 is the namenode.
On hadoop-2-00 I have this /etc/hosts file and this ifconfig output:

*/etc/hosts:*

172.16.67.68 hadoop-2-00

*ifconfig*:

eth0      Link encap:Ethernet  HWaddr fa:16:3e:4c:06:25
          inet addr:172.16.67.68  Bcast:172.16.95.255  Mask:255.255.224.0
          inet6 addr: fe80::f816:3eff:fe4c:625/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:73475 errors:0 dropped:0 overruns:0 frame:0
          TX packets:58912 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:100923399 (100.9 MB)  TX bytes:101169918 (101.1 MB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:10 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:588 (588.0 B)  TX bytes:588 (588.0 B)

The NAT IP for this node is 10.70.5.51.

I use the host-name (*hadoop-2-00*) in all the Hadoop configuration files.
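
For reference, the relevant core-site.xml entry looks roughly like this
(fs.default.name is the 0.20-era name of the property, fs.defaultFS in 2.x;
8020 is the namenode RPC port that appears in the datanode log below):

<property>
  <name>fs.default.name</name>
  <value>hdfs://hadoop-2-00:8020</value>
</property>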

The other node is the datanode *hadoop-2-01*, with this ifconfig output and
/etc/hosts file:

eth0      Link encap:Ethernet  HWaddr fa:16:3e:70:5e:bd
          inet addr:172.16.67.69  Bcast:172.16.95.255  Mask:255.255.224.0
          inet6 addr: fe80::f816:3eff:fe70:5ebd/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:27081 errors:0 dropped:0 overruns:0 frame:0
          TX packets:24105 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:95842550 (95.8 MB)  TX bytes:4314694 (4.3 MB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:34 errors:0 dropped:0 overruns:0 frame:0
          TX packets:34 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1900 (1.9 KB)  TX bytes:1900 (1.9 KB)

*/etc/hosts:*

172.16.67.69 hadoop-2-01

The NAT IP for that host is 10.70.5.57.
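
Note that hadoop-2-01 has no /etc/hosts entry for hadoop-2-00, so the
namenode host-name is resolved (presumably through DNS) to its NAT address;
that matches the hadoop-2-00/10.70.5.51 that appears in the datanode log
below. A hypothetical lookup on hadoop-2-01 would look something like:

$ getent hosts hadoop-2-00
10.70.5.51      hadoop-2-00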

When I start the namenode there is no problem.

But when I start the datanode there is an error.

This is the stack trace from the datanode log:

2013-04-10 16:01:26,997 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool
BP-2054036249-172.16.67.68-1365621320283 (storage id
DS-1556234100-172.16.67.69-50010-1365621786288) service to hadoop-2-00/
10.70.5.51:8020 beginning handshake with NN
2013-04-10 16:01:27,013 FATAL
org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for
block pool Block pool BP-2054036249-172.16.67.68-1365621320283 (storage id
DS-1556234100-172.16.67.69-50010-1365621786288) service to hadoop-2-00/
10.70.5.51:8020
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException):
Datanode denied communication with namenode: DatanodeRegistration(0.0.0.0,
storageID=DS-1556234100-172.16.67.69-50010-1365621786288, infoPort=50075,
ipcPort=50020,
storageInfo=lv=-40;cid=CID-65f42cc4-6c02-4537-9fb8-627a612ec74e;nsid=1995699852;c=0)
        at
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:629)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:3459)
        at
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:881)
        at
org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:90)
        at
org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:18295)
        at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1735)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1731)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1729)

        at org.apache.hadoop.ipc.Client.call(Client.java:1235)
        at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        at $Proxy10.registerDatanode(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
        at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
        at $Proxy10.registerDatanode(Unknown Source)
        at
org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.registerDatanode(DatanodeProtocolClientSideTranslatorPB.java:146)
        at
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.register(BPServiceActor.java:623)
        at
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:225)
        at
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:664)
        at java.lang.Thread.run(Thread.java:662)
2013-04-10 16:01:27,015 WARN
org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service
for: Block pool BP-2054036249-172.16.67.68-1365621320283 (storage id
DS-1556234100-172.16.67.69-50010-1365621786288) service to hadoop-2-00/
10.70.5.51:8020
2013-04-10 16:01:27,016 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool
BP-2054036249-172.16.67.68-1365621320283 (storage id
DS-1556234100-172.16.67.69-50010-1365621786288)
2013-04-10 16:01:27,016 INFO
org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Removed
bpid=BP-2054036249-172.16.67.68-1365621320283 from blockPoolScannerMap
2013-04-10 16:01:27,016 INFO
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl:
Removing block pool BP-2054036249-172.16.67.68-1365621320283
2013-04-10 16:01:29,017 WARN
org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2013-04-10 16:01:29,019 INFO org.apache.hadoop.util.ExitUtil: Exiting with
status 0
2013-04-10 16:01:29,021 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at hadoop-2-01/172.16.67.69



Do you know if there's a way to solve this?

Any ideas?

Thanks.
Mauro.
