hadoop-common-dev mailing list archives

From "Christopher Tubbs (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HADOOP-12919) MiniDFSCluster uses wrong IP address
Date Fri, 11 Mar 2016 22:27:02 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-12919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved HADOOP-12919.
----------------------------------------
    Resolution: Duplicate

Sorry about the duplicate. JIRA is so slow right now, I didn't realize the previous submit
made it through.

> MiniDFSCluster uses wrong IP address
> ------------------------------------
>
>                 Key: HADOOP-12919
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12919
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ipc
>    Affects Versions: 2.2.0, 2.6.1, 2.6.3
>            Reporter: Christopher Tubbs
>
> MiniDFSCluster seems to be registering the DataNode using the machine's internal IP address,
rather than "localhost/127.0.0.1". It looks like the problem isn't MiniDFSCluster specific,
but that's what's biting me right now and I can't figure out a workaround.
> MiniDFSCluster logs show roughly the following (jetty services ignored):
> NameNode starts org.apache.hadoop.ipc.Server listening on localhost/127.0.0.1:43023
> DataNode reports "Configured hostname is 127.0.0.1"
> DataNode reports "Opened streaming server at /127.0.0.1:57310"
> DataNode starts org.apache.hadoop.ipc.Server listening on localhost/127.0.0.1:53015
> DataNode registers with NN using storage id DS-XXXXXXXXX-172.31.3.214-57310-XXXXXXXXXXXXX
with ipcPort=53015
> NameNode reports "Adding a new node: /default-rack/172.31.3.214:57310"
> The storage id should have been derived from 127.0.0.1, and so should all the other
registered information.
> I've verified with netstat that all services were listening only on 127.0.0.1
> This resulted in the client being unable to write blocks to the datanode, because the
datanode was not listening on the address the namenode gave to the client (the address the
datanode was registered under).
> The actual client error message is:
> {code:java}
> [IPC Server handler 0 on 43023} INFO  org.apache.hadoop.hdfs.StateChange  - BLOCK* allocateBlock:
/test-dir/HelloWorld.jar. BP-460569874-172.31.3.214-1457727894640 blk_1073741825_1001{blockUCState=UNDER_CONSTRUCTION,
primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.31.3.214:57310|RBW]]}
> [Thread-61} INFO  org.apache.hadoop.hdfs.DFSClient  - Exception in createBlockOutputStream
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>   at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>   at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1305)
>   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1128)
>   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1088)
>   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
> [Thread-61} INFO  org.apache.hadoop.hdfs.DFSClient  - Abandoning BP-460569874-172.31.3.214-1457727894640:blk_1073741825_1001
> [Thread-61} INFO  org.apache.hadoop.hdfs.DFSClient  - Excluding datanode 172.31.3.214:57310
> [IPC Server handler 2 on 43023} WARN  org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
 - Not able to place enough replicas, still in need of 1 to reach 1
> For more information, please enable DEBUG log level on org.apache.commons.logging.impl.Log4JLogger
> [IPC Server handler 2 on 43023} ERROR org.apache.hadoop.security.UserGroupInformation
 - PriviledgedActionException as:christopher (auth:SIMPLE) cause:java.io.IOException: File
/test-dir/HelloWorld.jar could only be replicated to 0 nodes instead of minReplication (=1).
 There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
> [IPC Server handler 2 on 43023} INFO  org.apache.hadoop.ipc.Server  - IPC Server handler
2 on 43023, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 172.31.3.214:57395
Call#12 Retry#0: error: java.io.IOException: File /test-dir/HelloWorld.jar could only be replicated
to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s)
are excluded in this operation.
> java.io.IOException: File /test-dir/HelloWorld.jar could only be replicated to 0 nodes
instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s) are excluded
in this operation.
>   at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
>   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
>   at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
> [Thread-61} WARN  org.apache.hadoop.hdfs.DFSClient  - DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /test-dir/HelloWorld.jar
could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s)
running and 1 node(s) are excluded in this operation.
>   at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
>   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
>   at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy17.addBlock(Unknown Source)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>   at com.sun.proxy.$Proxy17.addBlock(Unknown Source)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
>   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
>   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
>   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
> {code}
> Additional information:
> I've tried with Hadoop 2.2.0, 2.6.1, and 2.6.3, with the same results. It probably affects
other versions.
> I do not see this problem running locally, only in EC2, but I have not yet found
a relevant networking configuration difference that would have any effect (no extra entries
in /etc/hosts, no DNS issues, etc.).
> I can reproduce this easily by trying to build Accumulo's master branch (HEAD at db21315)
with `mvn clean package -Dtest=VfsClassLoaderTest -DfailIfNoTests=false -Dhadoop.version=2.6.3`
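
The failure mode described above (a service bound only to 127.0.0.1 but advertised under a different address) can be reproduced outside Hadoop with a small diagnostic. This is only a sketch under stated assumptions: the class name LoopbackBindCheck and the helper canConnect are hypothetical, and the report does not establish that Hadoop derives the registered address from InetAddress.getLocalHost(). The lookup is printed here purely as one thing worth comparing between the local machine and the EC2 host.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class LoopbackBindCheck {

    /** Returns true if a TCP connection to addr:port succeeds within timeoutMillis. */
    public static boolean canConnect(InetAddress addr, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(addr, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused" as well as timeouts.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // What the JVM thinks this machine's own address is (assumption:
        // this may or may not be the lookup behind the registered address).
        InetAddress localHost = InetAddress.getLocalHost();

        // Bind to the loopback address only, on an ephemeral port,
        // mirroring services that listen only on 127.0.0.1.
        try (ServerSocket server = new ServerSocket(0, 50, loopback)) {
            int port = server.getLocalPort();
            System.out.println("Listening on " + loopback + ":" + port);
            System.out.println("getLocalHost() resolves to " + localHost);
            System.out.println("Connect via loopback:       "
                    + canConnect(loopback, port, 1000));
            System.out.println("Connect via getLocalHost(): "
                    + canConnect(localHost, port, 1000));
        }
    }
}
```

If the second connection attempt fails while the first succeeds, a client handed the non-loopback address would see the same "Connection refused" as in the log above.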



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
