hadoop-hdfs-user mailing list archives

From Praveen Sripati <praveensrip...@gmail.com>
Subject DataNode not able to talk to NameNode
Date Fri, 02 Apr 2010 01:59:34 GMT

I am trying to set up Hadoop on a two-node cluster, both nodes running Ubuntu 9.10. I
have configured one node as NameNode/JobTracker and the other as

I have the following in the hosts file for the master and the slave

master> cat /etc/hosts master slave localhost

slave> cat /etc/hosts master slave localhosts

and the configuration file on the master and the slave has

master     -> core-site.xml -> fs.default.name->hdfs://localhost:9050
    -> hdfs-site.xml -> dfs.replication->1
    -> mapred-site.xml -> mapred.job.tracker->localhost:9001

slave     -> core-site.xml -> fs.default.name->hdfs://master:9050
    -> hdfs-site.xml -> dfs.replication->1
    -> mapred-site.xml -> mapred.job.tracker->master:9001

When I run start-dfs.sh, the NameNode starts without any errors and the
script then tries to start the DataNode. However, the DataNode is not able to
connect to the NameNode. The following is in the
hadoop-praveensripati-datanode-slave.log file

2010-04-02 06:54:35,630 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/ Already tried 9 time(s).
2010-04-02 06:54:35,645 INFO org.apache.hadoop.ipc.RPC: Server at master/ not available yet, Zzzzz...

1. I am able to ping the master from the slave and vice versa.
2. I am able to ssh into the slave from the master and vice versa.
3. I have disabled IPv6 on both the master and the slave. /etc/sysctl.conf has
net.ipv6.conf.all.disable_ipv6 = 1.

I wrote a Java SocketClient Program to connect from the DataNode to the
NameNode at port 9050 and I get the following exception

java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
    at java.net.Socket.connect(Socket.java:525)
    at SocketClient.main(SocketClient.java:23)

I then stopped the NameNode and DataNode and, using two small Java programs,
opened a listening socket on port 9050 on the NameNode host. The client
program running on the DataNode host was able to connect to it.

SocketServer.java has

    int port = Integer.parseInt(args[0]);
    ServerSocket srv = new ServerSocket(port);
    Socket socket = srv.accept();

SocketClient.java has

    InetAddress addr = InetAddress.getByName(args[0]);
    int port = Integer.parseInt(args[1]);
    SocketAddress sockaddr = new InetSocketAddress(addr, port);
    Socket sock = new Socket();
    int timeoutMs = 2000;
    sock.connect(sockaddr, timeoutMs);
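For anyone who wants to repeat this check, a self-contained version of the client might look like the sketch below. The class name `PortProbe`, the method `canConnect`, and the default host name `master` are illustrative assumptions, not the exact program used above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {

    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket sock = new Socket()) {
            sock.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused", timeouts, and unresolved host names.
            System.err.println(host + ":" + port + " -> " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are assumptions taken from the configuration shown earlier.
        String host = args.length > 0 ? args[0] : "master";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9050;
        System.out.println(canConnect(host, port, 2000) ? "connected" : "failed");
    }
}
```

Run from the slave as `java PortProbe master 9050`, it only reports whether the TCP connection can be opened; it does not speak the Hadoop RPC protocol.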

When I do 'netstat -a | grep 9050' I get

When the NameNode creates the socket    -> tcp  0  0  localhost:9050  *:*  LISTEN
When the Java program creates a socket  -> tcp  0  0  *:9050          *:*  LISTEN

Why is the DataNode not able to connect to port 9050 on the NameNode, while
SocketClient.java can connect to SocketServer.java on the same port? Is there
anything different about the way the NameNode creates its socket?
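The two netstat lines differ only in the local address (localhost:9050 versus *:9050). This can be reproduced outside Hadoop with the minimal sketch below, which assumes the difference comes purely from the address the listening socket is bound to. `BindAddressDemo` and its methods are hypothetical names for illustration and are not part of Hadoop.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class BindAddressDemo {

    // Listens on one specific local address; port 0 lets the OS pick a free port.
    public static ServerSocket listenOn(InetAddress bindAddr) throws IOException {
        ServerSocket srv = new ServerSocket();
        srv.bind(new InetSocketAddress(bindAddr, 0));
        return srv;
    }

    // Listens on the wildcard address, which is what new ServerSocket(port) does.
    public static ServerSocket listenOnWildcard() throws IOException {
        return new ServerSocket(0);
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket loopback = listenOn(InetAddress.getLoopbackAddress());
             ServerSocket wildcard = listenOnWildcard()) {
            System.out.println("loopback-only: " + loopback.getLocalSocketAddress());
            System.out.println("wildcard:      " + wildcard.getLocalSocketAddress());
        }
    }
}
```

A socket bound only to the loopback address accepts connections from the same host and refuses them from other machines, which would be consistent with the "Connection refused" seen from the slave. Whether the NameNode binds this way because of the hdfs://localhost:9050 setting on the master is an assumption that still needs to be confirmed.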
