hadoop-hdfs-user mailing list archives

From "zhu weimin" <xim-...@tsm.kddilabs.jp>
Subject RE: DataNode not able to talk to NameNode
Date Fri, 02 Apr 2010 05:53:17 GMT
Hi,

Hadoop is, for the most part, an ordinary Java program.

The NameNode creates its listening socket with ServerSocket and a bindAddr
parameter. With your settings, the code behaves roughly like this:

ServerSocket socket = new ServerSocket();
// bindAddr = localhost, port = 9050
socket.bind(new InetSocketAddress("localhost", 9050));

Because bindAddr is set to localhost, the NameNode process can only accept
connections from localhost.

If bindAddr is set to the machine's host name or IP address, the NameNode
process can accept connections from remote machines.
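
The difference between the two binds can be seen in a small standalone program. This is only a sketch and is not taken from the Hadoop source; the class and method names (BindAddressDemo, listenOnLoopback, listenOnAllInterfaces) are made up for illustration, and port 0 (let the OS pick a free port) is used instead of 9050 so it does not collide with a running NameNode.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical demo class, not part of Hadoop.
final class BindAddressDemo {

    /** Listener bound to the loopback interface only (like bindAddr = localhost). */
    static ServerSocket listenOnLoopback() {
        try {
            ServerSocket socket = new ServerSocket(); // created unbound
            // Port 0 asks the OS for any free port.
            socket.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            return socket;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Listener bound to the wildcard address, which is what new ServerSocket(port) does. */
    static ServerSocket listenOnAllInterfaces() {
        try {
            return new ServerSocket(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket local = listenOnLoopback();
             ServerSocket any = listenOnAllInterfaces()) {
            // The first address is a loopback address; the second is the wildcard address.
            System.out.println("loopback only:  " + local.getLocalSocketAddress());
            System.out.println("all interfaces: " + any.getLocalSocketAddress());
        }
    }
}
```

A socket opened the first way shows up in netstat with a localhost local address and refuses connections arriving on the machine's LAN address; the second shows up as *:port.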

 

 

 

From: Praveen Sripati [mailto:praveensripati@gmail.com] 
Sent: Friday, April 02, 2010 1:50 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: DataNode not able to talk to NameNode

 


Hi,

Thanks for the quick response. Wish the documentation was a bit more clear
on it.

Now it works. I get "Live Datanodes : 1" in the NameNode console.

>> modify fs.default.name from hdfs://localhost:9050 to hdfs://master:9050

Just curious, how does it impact the Socket on the NameNode?

Praveen

On Fri, Apr 2, 2010 at 7:49 AM, zhu weimin <xim-shu@tsm.kddilabs.jp> wrote:

Hi

>Why is it that the DataNode is not able to connect to port 9050 on the
>NameNode, while SocketClient.java connects to SocketServer.java on port
>9050? Is there anything different about how the NameNode creates its socket?

In the master's configuration, change fs.default.name from
hdfs://localhost:9050 to hdfs://master:9050, and change mapred.job.tracker
from localhost:9001 to master:9001.
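
To illustrate why the host name in the URI matters, here is a rough sketch of how a host taken from a setting such as fs.default.name determines whether a listener would be local-only. It is not Hadoop's actual code; ListenAddressSketch and its methods are hypothetical names, and 192.168.0.100 is the address listed for master in the /etc/hosts files in this thread.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Hadoop.
final class ListenAddressSketch {

    /** Host part of a filesystem URI such as hdfs://master:9050. */
    static String hostOf(String fsUri) {
        return URI.create(fsUri).getHost();
    }

    /** Port part of a filesystem URI such as hdfs://master:9050. */
    static int portOf(String fsUri) {
        return URI.create(fsUri).getPort();
    }

    /** True if the host resolves to a loopback address, so a server bound to it is reachable only locally. */
    static boolean isLocalOnly(String host) {
        try {
            return InetAddress.getByName(host).isLoopbackAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("cannot resolve " + host, e);
        }
    }

    public static void main(String[] args) {
        String[] uris = {"hdfs://localhost:9050", "hdfs://192.168.0.100:9050"};
        for (String uri : uris) {
            String host = hostOf(uri);
            System.out.println(uri + " -> host=" + host
                    + " port=" + portOf(uri)
                    + " localOnly=" + isLocalOnly(host));
        }
    }
}
```

With localhost the resolved address is a loopback address, so only processes on the same machine can connect; with the master's LAN address, the DataNode on the slave can reach it.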

 

 

zhuweimin

 

 

From: Praveen Sripati [mailto:praveensripati@gmail.com] 
Sent: Friday, April 02, 2010 11:00 AM
To: hdfs-user@hadoop.apache.org
Subject: DataNode not able to talk to NameNode

 

Hi,

I am trying to set up Hadoop on a two-node cluster, with both nodes running
Ubuntu 9.10. I have configured one node as NameNode/JobTracker and the other
as DataNode/TaskTracker.

I have the following in the hosts files on the master and the slave:

master> cat /etc/hosts
192.168.0.100 master
192.168.0.102 slave
127.0.0.1 localhost

slave> cat /etc/hosts
192.168.0.100 master
192.168.0.102 slave
127.0.0.1 localhosts

and the configuration files on the master and the slave contain:

master -> core-site.xml   -> fs.default.name    -> hdfs://localhost:9050
       -> hdfs-site.xml   -> dfs.replication    -> 1
       -> mapred-site.xml -> mapred.job.tracker -> localhost:9001

slave  -> core-site.xml   -> fs.default.name    -> hdfs://master:9050
       -> hdfs-site.xml   -> dfs.replication    -> 1
       -> mapred-site.xml -> mapred.job.tracker -> master:9001


When I run start-dfs.sh, the NameNode starts without any errors and the
script tries to start the DataNode. However, the DataNode is not able to
connect to the NameNode. The following appears in the
hadoop-praveensripati-datanode-slave.log file:

2010-04-02 06:54:35,630 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.0.100:9050. Already tried 9 time(s).
2010-04-02 06:54:35,645 INFO org.apache.hadoop.ipc.RPC: Server at master/192.168.0.100:9050 not available yet, Zzzzz...

1. Able to ping the master from the slave and the other way.
2. Able to ssh into slave from master and other way.
3. Disabled ipv6 on master and slave. /etc/sysctl.conf has
net.ipv6.conf.all.disable_ipv6 = 1.

I wrote a Java SocketClient program to connect from the DataNode to the
NameNode on port 9050, and I get the following exception:

java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
    at java.net.Socket.connect(Socket.java:525)
    at SocketClient.main(SocketClient.java:23)

Then I stopped the NameNode and DataNode, used a Java program to open a
listening socket on port 9050 on the NameNode machine, and was able to
connect to it from the DataNode machine with the Java client.

SocketServer.java has

    int port = Integer.parseInt(args[0]);
    ServerSocket srv = new ServerSocket(port);
    Socket socket = srv.accept();

SocketClient.java has

    InetAddress addr = InetAddress.getByName(args[0]);
    int port = Integer.parseInt(args[1]);
    SocketAddress sockaddr = new InetSocketAddress(addr, port);
    Socket sock = new Socket();
    int timeoutMs = 2000;
    sock.connect(sockaddr, timeoutMs);

When I run 'netstat -a | grep 9050' I get:

When the NameNode creates the socket:
tcp        0      0 localhost:9050          *:*                     LISTEN

When the Java program creates the socket:
tcp        0      0 *:9050                  *:*                     LISTEN

Why is it that the DataNode is not able to connect to port 9050 on the
NameNode, while SocketClient.java connects to SocketServer.java on port 9050?
Is there anything different about how the NameNode creates its socket?
-- 
Praveen





