hadoop-common-user mailing list archives

From Ronan Lehane <ronan.leh...@gmail.com>
Subject Data Nodes not seeing NameNode / Task Trackers not seeing JobTracker
Date Mon, 16 Jul 2012 18:35:03 GMT
Hi All,

I was wondering if anyone could help me figure out what's going wrong in my
five-node Hadoop cluster, please?

It consists of:
1. NameNode
hduser@namenode:/usr/local/hadoop$ jps
13049 DataNode
13387 Jps
12740 NameNode
13316 SecondaryNameNode

2. JobTracker
hduser@jobtracker:/usr/local/hadoop$ jps
21817 TaskTracker
21448 DataNode
21542 JobTracker
21862 Jps

3. Slave1
hduser@slave1:/usr/local/hadoop$ jps
21226 DataNode
21514 Jps
21463 TaskTracker

4. Slave2
hduser@slave2:/usr/local/hadoop$ jps
20938 Jps
20650 DataNode
20887 TaskTracker

5. Slave3
hduser@slave3:/usr/local/hadoop$ jps
22145 Jps
21854 DataNode
22091 TaskTracker

All DataNodes have been kicked off by running start-dfs.sh on the NameNode.
All TaskTrackers have been kicked off by running start-mapred.sh on the
JobTracker.

When I try to execute a simple wordcount job from the NameNode I receive
the following error:
12/07/16 19:25:22 ERROR security.UserGroupInformation:
PriviledgedActionException as:hduser cause:java.net.ConnectException: Call
to jobtracker/ failed on connection exception:
java.net.ConnectException: Connection refused

If I check the jobtracker:
1. I can ping in both directions by both IP and Hostname
2. I can see that the jobtracker is listening on port 54311
tcp        0      0*
LISTEN      1001       425093      21542/java
3. Telnet to this port from the NameNode fails with "Connection Refused"
telnet: Unable to connect to remote host: Connection refused
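
The telnet check in step 3 can also be scripted so it is easy to repeat from
each node. This is only a minimal sketch in Python: the helper name
can_connect is made up for illustration, and the host and port in the usage
comment are the ones mentioned in this message.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name-resolution failures
        return False


# Example use (hostname and port taken from this message):
#   can_connect("jobtracker", 54311)
```

Running the same probe on the jobtracker host itself and then from the
NameNode shows whether the port is reachable only locally or from other
machines as well.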

This issue can be worked around by moving the JobTracker functionality to
the NameNode, but when this is done the job is executed on the NameNode
rather than distributed across the cluster.
Checking the log files on the slave nodes, I see the Server Not Available
messages referenced at the below wiki.
The DataNodes are not seeing the NameNode and the TaskTrackers are not seeing
the JobTracker.
Checking the JobTracker web interface, it always states there is only 1
node available.

I've checked the 5 troubleshooting steps provided but it all looks to be ok
in my environment.

Would anyone have any ideas of what could be causing this?
Any help would be appreciated.

