hadoop-common-dev mailing list archives

From Merto Mertek <masmer...@gmail.com>
Subject Re: slave nodes could not connect to Master .
Date Wed, 26 Oct 2011 21:05:04 GMT
OK, the problem seems to be solved. In the core-site.xml file on the namenode I
had left the value of fs.default.name pointing to localhost. After changing it
to hadoopmaster and restarting, I could see the other nodes too.
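
For anyone hitting the same thing, here is a minimal sketch of how the setting could be checked before starting the daemons. It assumes the usual core-site.xml layout of <property> elements with <name> and <value> children; the helper names and the sample XML are made up for illustration, and the port is simply the one from the log further down.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from urllib.parse import urlparse

# Host names that only make sense on the machine itself.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}

# Illustrative sample only, not the actual file from this cluster.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>"""


def default_fs_host(core_site_xml: str) -> Optional[str]:
    """Return the host part of fs.default.name, or None if it is not set."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "fs.default.name":
            value = prop.findtext("value", "").strip()
            # Accept "host:port" as well as "hdfs://host:port".
            if "://" not in value:
                value = "hdfs://" + value
            return urlparse(value).hostname
    return None


def points_to_loopback(core_site_xml: str) -> bool:
    """True if fs.default.name names a loopback host (unreachable from slaves)."""
    return default_fs_host(core_site_xml) in LOOPBACK_HOSTS


if __name__ == "__main__":
    print(default_fs_host(SAMPLE_CORE_SITE))
    print(points_to_loopback(SAMPLE_CORE_SITE))
```

If points_to_loopback returns True for the file on the master, the slaves will most likely not be able to reach the namenode, as happened here.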

On 26 October 2011 04:02, Merto Mertek <masmertoz@gmail.com> wrote:

> Same problem here. Sachinites, have you found a solution?
> I have three nodes in the same situation as described above. Pinging
> and ssh-ing to the master node works OK, however the logs are full of the
> following message:
>
> "INFO org.apache.hadoop.ipc.Client: Retrying connect to server:
> hadoopmaster/192.168.12.1:54310"
>
> The JobTracker sees just one node (http://hadoopmaster:50030/jobtracker.jsp,
> probably the local one). The hadoopmaster node has the following listening
> process:
>
> 127.0.0.1:54310         0.0.0.0:*               LISTEN      1001       17909       5131/java
>
> so the problem is probably elsewhere.
>
> Do you have any idea what could be wrong?
>
>
> On 11 August 2011 12:19, V@ni <vanitham89@gmail.com> wrote:
>
>>
>> Hi, this could be an SSH configuration error. Reinstall SSH and try
>> connecting to the master node. Hope it works.
>>
>> Maybe this link will be useful:
>>
>> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
>>
>>
>>
>>
>> sachinites wrote:
>> >
>> > Sir, I have tried all the forums but could not resolve this
>> > problem. Please help me out.
>> >
>> > I followed your tutorial to run Hadoop, first running a datanode
>> > locally on the same machine, and it all worked fine.
>> >
>> > Then I configured Hadoop to run the namenode, secondarynamenode, a
>> > jobtracker and a datanode on one machine (the master), with the other
>> > two machines as slaves, for a total of three datanodes. start-all.sh
>> > successfully starts all daemons on all required nodes.
>> > But it seems that only the datanode running locally on my machine is
>> > executing the whole job; the other two slaves are starving for a
>> > connection to the master. Their tasktracker logs read (same on both
>> > slaves):
>> >
>> > 2011-08-09 07:09:54,099 INFO org.apache.hadoop.ipc.Server: IPC Server
>> > Responder: starting
>> > 2011-08-09 07:09:54,100 INFO org.apache.hadoop.ipc.Server: IPC Server
>> > listener on 37874: starting
>> > 2011-08-09 07:09:54,103 INFO org.apache.hadoop.ipc.Server: IPC Server
>> > handler 0 on 37874: starting
>> > 2011-08-09 07:09:54,104 INFO org.apache.hadoop.ipc.Server: IPC Server
>> > handler 1 on 37874: starting
>> > 2011-08-09 07:09:54,104 INFO org.apache.hadoop.ipc.Server: IPC Server
>> > handler 2 on 37874: starting
>> > 2011-08-09 07:09:54,105 INFO org.apache.hadoop.ipc.Server: IPC Server
>> > handler 3 on 37874: starting
>> > 2011-08-09 07:09:54,105 INFO org.apache.hadoop.mapred.TaskTracker:
>> > TaskTracker up at: localhost/127.0.0.1:37874
>> > 2011-08-09 07:09:54,105 INFO org.apache.hadoop.mapred.TaskTracker:
>> > Starting tracker tracker_gislab-desktop:localhost/127.0.0.1:37874
>> > 2011-08-09 07:09:55,145 INFO org.apache.hadoop.ipc.Client: Retrying
>> > connect to server: /10.14.11.32:9001. Already tried 0 time(s).
>> > 2011-08-09 07:09:56,146 INFO org.apache.hadoop.ipc.Client: Retrying
>> > connect to server: /10.14.11.32:9001. Already tried 1 time(s).
>> > 2011-08-09 07:09:57,147 INFO org.apache.hadoop.ipc.Client: Retrying
>> > connect to server: /10.14.11.32:9001. Already tried 2 time(s).
>> > 2011-08-09 07:09:58,148 INFO org.apache.hadoop.ipc.Client: Retrying
>> > connect to server: /10.14.11.32:9001. Already tried 3 time(s).
>> > 2011-08-09 07:09:59,149 INFO org.apache.hadoop.ipc.Client: Retrying
>> > connect to server: /10.14.11.32:9001. Already tried 4 time(s).
>> > 2011-08-09 07:10:00,150 INFO org.apache.hadoop.ipc.Client: Retrying
>> > connect to server: /10.14.11.32:9001. Already tried 5 time(s).
>> > 2011-08-09 07:10:01,151 INFO org.apache.hadoop.ipc.Client: Retrying
>> > connect to server: /10.14.11.32:9001. Already tried 6 time(s).
>> > 2011-08-09 07:10:02,151 INFO org.apache.hadoop.ipc.Client: Retrying
>> > connect to server: /10.14.11.32:9001. Already tried 7 time(s).
>> > 2011-08-09 07:10:03,152 INFO org.apache.hadoop.ipc.Client: Retrying
>> > connect to server: /10.14.11.32:9001. Already tried 8 time(s).
>> > 2011-08-09 07:10:04,153 INFO org.apache.hadoop.ipc.Client: Retrying
>> > connect to server: /10.14.11.32:9001. Already tried 9 time(s).
>> > 2011-08-09 07:10:04,156 INFO org.apache.hadoop.ipc.RPC: Server at
>> > /10.14.11.32:9001 not available yet, Zzzzz...
>> > 2011-08-09 07:10:06,158 INFO org.apache.hadoop.ipc.Client: Retrying
>> > connect to server: /10.14.11.32:9001. Already tried 0 time(s).
>> > 2011-08-09 07:10:07,159 INFO org.apache.hadoop.ipc.Client: Retrying
>> > connect to server: /10.14.11.32:9001. Already tried 1 time(s).
>> >
>> > ... and so on. I tried re-enabling the IPv6 address, but in vain.
>> > The data in HDFS is distributed across all datanodes, though. This
>> > confirms it is not a network problem either. I can ssh without a
>> > password in all directions.
>> >
>> > Sir, please help . I shall be grateful to you.
>> >
>> > Abhishek
>> > Masters Student
>> > IIT Bombay
>> > INDIA
>> >
>> >
>>
>> --
>> View this message in context:
>> http://old.nabble.com/slave-nodes-could-not-connect-to-Master-.-tp32223105p32240913.html
>> Sent from the Hadoop core-dev mailing list archive at Nabble.com.
>>
>>
>
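
Since ping and ssh working does not say anything about a specific TCP port, a quick check from a slave can show whether the master's port is reachable at all. The following is only a sketch: can_connect is a hypothetical helper name, and the host and port in the example call are the ones from the log quoted above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Try to open a TCP connection to host:port; return True on success."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Run this on a slave node; host and port are taken from the thread.
    print(can_connect("hadoopmaster", 54310))
```

A daemon that listens only on 127.0.0.1, as in the listing above, accepts connections from the master itself but not from other machines, so this check would fail from a slave even though ping and ssh succeed.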
