hadoop-mapreduce-user mailing list archives

From Mauro Cohen <mauroco...@gmail.com>
Subject Re: Problem With NAT ips
Date Thu, 11 Apr 2013 15:10:25 GMT
Thank you, Daryn, for your response.

I tried what you suggested, and the datanode is now working. But there is
another problem.

On the namenode's live nodes page I can see my datanode listed as alive.
But when I try to open the datanode's page, I get this message as a
response:

No Route to Host from hadoop-2-01/172.16.67.69 to 172.16.67.68:8020 failed
on socket timeout exception: java.net.NoRouteToHostException: No route to
host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
It seems that at some point it is still passing the private IP to
communicate between the nodes.

When I look at the URL of the link, it passes the private IP of the
namenode as the nnaddr param:

http://hadoop-2-01:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F&nnaddr=172.16.67.68:8020

If I put the namenode's hostname or its public IP in that param, it works
fine.

But when I run any job that reads data from the datanode, it uses the
private IP to communicate, so I get the typical "could not obtain block"
message.
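One idea I came across but have not yet verified for my Hadoop version (so treat this as a sketch based on HDFS-3150, not a confirmed fix) is telling clients and datanodes to connect by hostname instead of IP, in hdfs-site.xml:

```xml
<!-- Sketch only: assumes the HDFS-3150 hostname properties exist in this release. -->
<property>
  <!-- Clients connect to datanodes by hostname instead of the IP the NN recorded. -->
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
<property>
  <!-- Datanodes also use hostnames when connecting to other datanodes. -->
  <name>dfs.datanode.use.datanode.hostname</name>
  <value>true</value>
</property>
```

If the hostnames resolve to the NAT IPs on every machine (via DNS or /etc/hosts), this should keep the private IPs out of the data path.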

Any ideas?


Thanks.
Mauro.

2013/4/11 Daryn Sharp <daryn@yahoo-inc.com>

>  Hi Mauro,
>
>  The registration process has changed quite a bit.  I don't think the NN
> "trusts" the DN's self-identification anymore.  Otherwise it makes it
> trivial to spoof another DN, intentionally or not, which can be a security
> hazard.
>
>  I suspect the NN can't resolve the DN.  Unresolvable hosts are rejected
> because the allow/deny lists may contain hostnames.  If DNS is temporarily
> unavailable, you don't want a node blocked by hostname to slip through.
> Try adding the DN's public IP 10.70.5.57 to the NN's /etc/hosts if it's
> not resolvable via DNS.
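>  As a concrete sketch, the NN's /etc/hosts could end up looking like this
> (10.70.5.57 is the DN's NAT IP from your mail; adjust to your setup):
>
> ```
> 172.16.67.68 hadoop-2-00
> 10.70.5.57   hadoop-2-01
> ```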
>
>  I hope this helps!
>
>  Daryn
>
>   On Apr 10, 2013, at 4:32 PM, Mauro Cohen wrote:
>
>
>
> Hello, I have a problem with the new version of Hadoop.
>
>  I have a cluster with 2 nodes.
> Each one has a private IP and a public IP configured through NAT.
> The problem is that the private IPs of the nodes do not belong to the same
> network (I have no connectivity between the nodes through those IPs).
> I have connectivity between the nodes only through the NAT IPs (ssh, ping,
> etc.).
>
>  With Hadoop 0.20.x, when I configured the datanode and namenode
> configuration files, I always used the hostname for properties (e.g. the
> fs.default.name property) and never had problems with this.
> But in the new version of Hadoop, the way the nodes communicate with each
> other must have changed, and at some point they use the private IPs
> instead of hostnames.
>
>  I have installed a cluster with 2 nodes:
>
>  hadoop-2-00 is the namenode.
> On hadoop-2-00 I have this /etc/hosts file and this ifconfig output:
>
>  /etc/hosts:
>
>  172.16.67.68 hadoop-2-00
>
>  ifconfig:
>
>  eth0      Link encap:Ethernet  HWaddr fa:16:3e:4c:06:25
>           inet addr:172.16.67.68  Bcast:172.16.95.255  Mask:255.255.224.0
>           inet6 addr: fe80::f816:3eff:fe4c:625/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:73475 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:58912 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:100923399 (100.9 MB)  TX bytes:101169918 (101.1 MB)
>
>  lo        Link encap:Local Loopback
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           inet6 addr: ::1/128 Scope:Host
>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>           RX packets:10 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:588 (588.0 B)  TX bytes:588 (588.0 B)
>
>  The NAT IP for this node is 10.70.5.51.
>
>  I use the hostname (hadoop-2-00) in all of the Hadoop configuration
> files.
>
>  The other node is the datanode, hadoop-2-01, which has this ifconfig
> output and /etc/hosts:
>
>  eth0      Link encap:Ethernet  HWaddr fa:16:3e:70:5e:bd
>           inet addr:172.16.67.69  Bcast:172.16.95.255  Mask:255.255.224.0
>           inet6 addr: fe80::f816:3eff:fe70:5ebd/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:27081 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:24105 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:95842550 (95.8 MB)  TX bytes:4314694 (4.3 MB)
>
>  lo        Link encap:Local Loopback
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           inet6 addr: ::1/128 Scope:Host
>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>           RX packets:34 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:34 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:1900 (1.9 KB)  TX bytes:1900 (1.9 KB)
>
>  /etc/hosts:
>
>  172.16.67.69 hadoop-2-01
>
>  The NAT IP for that host is 10.70.5.57.
>
>  When I start the namenode there is no problem.
>
>  But when I start the datanode there is an error.
>
>  This is the stack trace from the datanode log:
>
>  2013-04-10 16:01:26,997 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool
> BP-2054036249-172.16.67.68-1365621320283 (storage id
> DS-1556234100-172.16.67.69-50010-1365621786288) service to hadoop-2-00/
> 10.70.5.51:8020 beginning handshake with NN
> 2013-04-10 16:01:27,013 FATAL
> org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for
> block pool Block pool BP-2054036249-172.16.67.68-1365621320283 (storage id
> DS-1556234100-172.16.67.69-50010-1365621786288) service to hadoop-2-00/
> 10.70.5.51:8020
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException):
> Datanode denied communication with namenode: DatanodeRegistration(0.0.0.0,
> storageID=DS-1556234100-172.16.67.69-50010-1365621786288, infoPort=50075,
> ipcPort=50020,
> storageInfo=lv=-40;cid=CID-65f42cc4-6c02-4537-9fb8-627a612ec74e;nsid=1995699852;c=0)
>         at
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:629)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:3459)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:881)
>         at
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:90)
>         at
> org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:18295)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1735)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1731)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1729)
>
>          at org.apache.hadoop.ipc.Client.call(Client.java:1235)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>         at $Proxy10.registerDatanode(Unknown Source)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>         at $Proxy10.registerDatanode(Unknown Source)
>         at
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.registerDatanode(DatanodeProtocolClientSideTranslatorPB.java:146)
>         at
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.register(BPServiceActor.java:623)
>         at
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:225)
>         at
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:664)
>         at java.lang.Thread.run(Thread.java:662)
> 2013-04-10 16:01:27,015 WARN
> org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service
> for: Block pool BP-2054036249-172.16.67.68-1365621320283 (storage id
> DS-1556234100-172.16.67.69-50010-1365621786288) service to hadoop-2-00/
> 10.70.5.51:8020
> 2013-04-10 16:01:27,016 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool
> BP-2054036249-172.16.67.68-1365621320283 (storage id
> DS-1556234100-172.16.67.69-50010-1365621786288)
> 2013-04-10 16:01:27,016 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Removed
> bpid=BP-2054036249-172.16.67.68-1365621320283 from blockPoolScannerMap
> 2013-04-10 16:01:27,016 INFO
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl:
> Removing block pool BP-2054036249-172.16.67.68-1365621320283
> 2013-04-10 16:01:29,017 WARN
> org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
> 2013-04-10 16:01:29,019 INFO org.apache.hadoop.util.ExitUtil: Exiting with
> status 0
> 2013-04-10 16:01:29,021 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down DataNode at hadoop-2-01/172.16.67.69
>
>
>
>  Do you know if there is a way to solve this?
>
>  Any ideas?
>
>  Thanks.
>  Mauro.
>
