Hi All, 
I'm trying to set up a Hadoop cluster using 4 machines [4 x Ubuntu 12.04 x64], using the following doc:
                  1. http://titan.softnet.tuc.gr:8082/User:xenia/Page_Title/Hadoop_Cluster_Setup_Tutorial

I'm able to set up the Hadoop cluster with the required configurations. I can see that all the required services on the master and on the slave nodes are running as expected [please see the jps command output below]. The problem I'm facing is that the HDFS and MapReduce daemons run on the master but can be accessed from the master only, and not from the slave machines. Note that I've opened these ports in the EC2 security group. And I can browse the master machine's UI from a web browser, using: http://<machine ip>:50070/dfshealth.jsp


Now, the problem I'm facing is that both HDFS and the JobTracker are accessible from the master machine [I'm using the master as both a NameNode and a DataNode], but the ports they use [HDFS: 54310 and MapReduce: 54320] are not accessible from the slave nodes.

I ran netstat -puntl on the master machine and got this:

hadoop@nutchcluster1:~/hadoop$ netstat -puntl
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      -               
tcp6       0      0 :::50020                :::*                    LISTEN      6224/java       
tcp6       0      0 127.0.0.1:54310         :::*                    LISTEN      6040/java       
tcp6       0      0 127.0.0.1:32776         :::*                    LISTEN      6723/java       
tcp6       0      0 :::57065                :::*                    LISTEN      6040/java       
tcp6       0      0 :::50090                :::*                    LISTEN      6401/java       
tcp6       0      0 :::50060                :::*                    LISTEN      6723/java       
tcp6       0      0 :::50030                :::*                    LISTEN      6540/java       
tcp6       0      0 127.0.0.1:54320         :::*                    LISTEN      6540/java       
tcp6       0      0 :::45747                :::*                    LISTEN      6401/java       
tcp6       0      0 :::33174                :::*                    LISTEN      6540/java       
tcp6       0      0 :::50070                :::*                    LISTEN      6040/java       
tcp6       0      0 :::22                   :::*                    LISTEN      -               
tcp6       0      0 :::54424                :::*                    LISTEN      6224/java       
tcp6       0      0 :::50010                :::*                    LISTEN      6224/java       
tcp6       0      0 :::50075                :::*                    LISTEN      6224/java       
udp        0      0 0.0.0.0:68              0.0.0.0:*                           -               
hadoop@nutchcluster1:~/hadoop$ 


As can be seen in the output, both the HDFS daemon and the MapReduce daemon are listening, but only on 127.0.0.1 and not on 0.0.0.0 [i.e., not reachable from any other machine/the slave machines]:
tcp6       0      0 127.0.0.1:54310         :::*                    LISTEN      6040/java 
tcp6       0      0 127.0.0.1:54320         :::*                    LISTEN      6540/java 
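If it matters, my understanding is that the NameNode and JobTracker bind to whatever address the hostname in the config (nutchcluster1) resolves to locally, and I'm assuming that resolution comes from /etc/hosts on these boxes. A quick way to see what the master resolves it to:

getent hosts nutchcluster1      # address the resolver returns for the configured hostname
grep nutchcluster1 /etc/hosts   # the /etc/hosts entry it should be picking up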


To confirm the same, I did this on the master:
hadoop@nutchcluster1:~/hadoop$ bin/hadoop fs -ls hdfs://nutchcluster1:54310/
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2012-12-02 12:53 /home
hadoop@nutchcluster1:~/hadoop$ 

But when I ran the same command on the slaves, I got this:
hadoop@nutchcluster2:~/hadoop$ bin/hadoop fs -ls hdfs://nutchcluster1:54310/
12/12/02 15:42:16 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 0 time(s).
12/12/02 15:42:17 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 1 time(s).
12/12/02 15:42:18 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 2 time(s).
12/12/02 15:42:19 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 3 time(s).
12/12/02 15:42:20 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 4 time(s).
12/12/02 15:42:21 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 5 time(s).
12/12/02 15:42:22 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 6 time(s).
12/12/02 15:42:23 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 7 time(s).
12/12/02 15:42:24 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 8 time(s).
12/12/02 15:42:25 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 9 time(s).
Bad connection to FS. command aborted. exception: Call to nutchcluster1/10.4.39.23:54310 failed on connection exception: java.net.ConnectException: Connection refused
hadoop@nutchcluster2:~/hadoop$ 
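For what it's worth, a raw TCP check from a slave should also rule out the EC2 security group. Something along these lines (assuming netcat is available on these Ubuntu boxes; the host and ports are the ones from my config):

nc -zv nutchcluster1 54310   # HDFS / NameNode RPC port from core-site.xml
nc -zv nutchcluster1 54320   # JobTracker port from mapred-site.xml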



The configurations are as below:

--------------core-site.xml content is as below:
<configuration>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/hadoop/datastore/hadoop-${user.name}</value>
  <description>A base for other temporary directories</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://nutchcluster1:54310</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>


-----------------hdfs-site.xml content is as below:
<configuration>
<property>
  <name>dfs.replication</name>
  <value>2</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>
</configuration>



------------------mapred-site.xml content is as below:
<configuration>
<property>
  <name>mapred.job.tracker</name>
  <value>nutchcluster1:54320</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>
<property>
  <name>mapred.map.tasks</name>
  <value>40</value>
  <description>As a rule of thumb, use 10x the number of slaves (i.e., number of tasktrackers).
  </description>
</property>

<property>
  <name>mapred.reduce.tasks</name>
  <value>40</value>
  <description>As a rule of thumb, use 2x the number of slave processors (i.e., number of tasktrackers).
  </description>
</property>
</configuration>


I replicated all of the above on the other 3 slave machines [1 master + 3 slaves]. My /etc/hosts content on the master node is as below. Note that I have the same content on the slaves as well; the only difference is that each machine's own hostname is mapped to 127.0.0.1, and the exact IPs are set for the others:

------------------------------/etc/hosts content:
127.0.0.1 localhost
127.0.0.1 nutchcluster1
10.111.59.96 nutchcluster2
10.201.223.79 nutchcluster3
10.190.117.68 nutchcluster4

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

-----------------/etc/hosts content ends here

The content of the masters file is:
nutchcluster1

and the content of the slaves file is:
nutchcluster1
nutchcluster2
nutchcluster3
nutchcluster4

Then I copied all the relevant contents of the config folder [*-site.xml, *.env files] to all the slaves.
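For reference, the copy was roughly along these lines (just a sketch; the conf path and the hadoop-env.sh name reflect my local Hadoop 1.x layout, and the hostnames come from the slaves file):

# push the site configs and the env file from the master to each slave
for host in nutchcluster2 nutchcluster3 nutchcluster4; do
  scp conf/*-site.xml conf/hadoop-env.sh ${host}:~/hadoop/conf/
done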

As per the steps, I'm starting HDFS using bin/start-dfs.sh and then starting MapReduce using bin/start-mapred.sh.
After running the above two on my master machine [nutchcluster1], I can see the following jps output:
hadoop@nutchcluster1:~/hadoop$ jps
6401 SecondaryNameNode
6723 TaskTracker
6224 DataNode
6540 JobTracker
7354 Jps
6040 NameNode
hadoop@nutchcluster1:~/hadoop$ 


and on the slaves, the jps output is: 
hadoop@nutchcluster2:~/hadoop$ jps
8952 DataNode
9104 TaskTracker
9388 Jps
hadoop@nutchcluster2:~/hadoop$ 


This clearly indicates that port 54310 is accessible from the master only and not from the slaves. This is the point I'm stuck at, and I would appreciate it if someone could point out which config is missing or what is wrong. Any comments/feedback in this regard would be highly appreciated. Thanks in advance.


Regards, 
DW