hadoop-common-user mailing list archives

From jason hadoop <jason.had...@gmail.com>
Subject Re: Re: Re: Re: Regarding "Hadoop multi cluster" set-up
Date Sat, 07 Feb 2009 17:06:20 GMT
On your master machine, use the netstat command to determine what ports and
addresses the namenode process is listening on.

On the datanode machines, examine the log files to verify that the
datanode has attempted to connect to the namenode IP address on one of
those ports, and was successful.
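
For example, something along these lines (a rough sketch; the logs/
directory and exact log file name depend on your install, e.g.
hadoop-root-datanode-slave.log as in the thread below) will surface
recent connection errors:

  grep -iE "connect|refused|route" logs/hadoop-*-datanode-*.log | tail -20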

The common ports used for the datanode -> namenode rendezvous are 50010,
54320 and 8020, depending on your Hadoop version.
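
The authoritative value is whatever your conf/hadoop-site.xml sets in
fs.default.name (54310 in the configuration quoted below), so you can
confirm the expected port with something like:

  grep -A 1 fs.default.name conf/hadoop-site.xml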

If the datanodes have been started, and the connection to the namenode
failed, there will be a log message with a socket error, indicating what
host and port the datanode used to attempt to communicate with the
namenode. Verify that the IP address is correct for your namenode, and
reachable from the datanode host (for multi-homed machines this can be an
issue), and that the port listed is one of the TCP ports that the namenode
process is listening on.
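
A quick reachability test from the datanode host (substitute your own
namenode address and port, e.g. master/172.16.0.46:54310 from the log
below) is:

  telnet 172.16.0.46 54310

If telnet reports the same "No route to host", the problem is at the
network or firewall layer; if it connects, look instead at which
interfaces the namenode is bound to.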

For Linux, you can use the command
*netstat -a -t -n -p | grep java | grep LISTEN*
to determine the IP addresses, ports and pids of the java processes that
are listening for TCP socket connections,

and the jps command from the bin directory of your Java installation to
determine the pid of the namenode.
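
Putting the two together, roughly (the pid 4321 here is made up; use
whatever jps actually prints for NameNode):

  jps
  # e.g.  4321 NameNode
  netstat -a -t -n -p | grep 4321 | grep LISTEN

The LISTEN lines for that pid show exactly which address:port pairs the
datanodes must be able to reach.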

On Sat, Feb 7, 2009 at 6:27 AM, shefali pawar <shefali_p@rediffmail.com> wrote:

> Hi,
>
> No, not yet. We are still struggling! If you find the solution please let
> me know.
>
> Shefali
>
> On Sat, 07 Feb 2009 02:56:15 +0530 Amandeep Khurana wrote:
> >I had to change the master on my running cluster and ended up with the
> >same problem. Were you able to fix it at your end?
> >
> >Amandeep
> >
> >
> >Amandeep Khurana
> >Computer Science Graduate Student
> >University of California, Santa Cruz
> >
> >
> >On Thu, Feb 5, 2009 at 8:46 AM, shefali pawar wrote:
> >
> >> Hi,
> >>
> >> I do not think that the firewall is blocking the port because it has
> >> been turned off on both the computers! And also since it is a random
> >> port number I do not think it should create a problem.
> >>
> >> I do not understand what is going wrong!
> >>
> >> Shefali
> >>
> >> On Wed, 04 Feb 2009 23:23:04 +0530 S D wrote:
> >> >I'm not certain that the firewall is your problem but if that port is
> >> >blocked on your master you should open it to let communication through.
> >> >Here is one website that might be relevant:
> >> >
> >> >http://stackoverflow.com/questions/255077/open-ports-under-fedora-core-8-for-vmware-server
> >> >
> >> >but again, this may not be your problem.
> >> >
> >> >John
> >> >
> >> >On Wed, Feb 4, 2009 at 12:46 PM, shefali pawar wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> I will have to check. I can do that tomorrow in college. But if that
> >> >> is the case what should I do?
> >> >>
> >> >> Should I change the port number and try again?
> >> >>
> >> >> Shefali
> >> >>
> >> >> On Wed, 04 Feb 2009 S D wrote:
> >> >>
> >> >> >Shefali,
> >> >> >
> >> >> >Is your firewall blocking port 54310 on the master?
> >> >> >
> >> >> >John
> >> >> >
> >> >> >On Wed, Feb 4, 2009 at 12:34 PM, shefali pawar wrote:
> >> >> >
> >> >> > > Hi,
> >> >> > >
> >> >> > > I am trying to set up a two-node cluster using Hadoop 0.19.0, with
> >> >> > > 1 master (which should also work as a slave) and 1 slave node.
> >> >> > >
> >> >> > > But while running bin/start-dfs.sh the datanode is not starting on
> >> >> > > the slave. I had read the previous mails on the list, but nothing
> >> >> > > seems to be working in this case. I am getting the following error
> >> >> > > in the hadoop-root-datanode-slave log file while running the command
> >> >> > > bin/start-dfs.sh =>
> >> >> > >
> >> >> > > 2009-02-03 13:00:27,516 INFO
> >> >> > > org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
> >> >> > > /************************************************************
> >> >> > > STARTUP_MSG: Starting DataNode
> >> >> > > STARTUP_MSG:  host = slave/172.16.0.32
> >> >> > > STARTUP_MSG:  args = []
> >> >> > > STARTUP_MSG:  version = 0.19.0
> >> >> > > STARTUP_MSG:  build =
> >> >> > > https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19
> >> >> > > -r 713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008
> >> >> > > ************************************************************/
> >> >> > > 2009-02-03 13:00:28,725 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > > connect to server: master/172.16.0.46:54310. Already tried 0 time(s).
> >> >> > > 2009-02-03 13:00:29,726 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > > connect to server: master/172.16.0.46:54310. Already tried 1 time(s).
> >> >> > > 2009-02-03 13:00:30,727 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > > connect to server: master/172.16.0.46:54310. Already tried 2 time(s).
> >> >> > > 2009-02-03 13:00:31,728 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > > connect to server: master/172.16.0.46:54310. Already tried 3 time(s).
> >> >> > > 2009-02-03 13:00:32,729 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > > connect to server: master/172.16.0.46:54310. Already tried 4 time(s).
> >> >> > > 2009-02-03 13:00:33,730 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > > connect to server: master/172.16.0.46:54310. Already tried 5 time(s).
> >> >> > > 2009-02-03 13:00:34,731 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > > connect to server: master/172.16.0.46:54310. Already tried 6 time(s).
> >> >> > > 2009-02-03 13:00:35,732 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > > connect to server: master/172.16.0.46:54310. Already tried 7 time(s).
> >> >> > > 2009-02-03 13:00:36,733 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > > connect to server: master/172.16.0.46:54310. Already tried 8 time(s).
> >> >> > > 2009-02-03 13:00:37,734 INFO org.apache.hadoop.ipc.Client: Retrying
> >> >> > > connect to server: master/172.16.0.46:54310. Already tried 9 time(s).
> >> >> > > 2009-02-03 13:00:37,738 ERROR
> >> >> > > org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException:
> >> >> > > Call to master/172.16.0.46:54310 failed on local exception: No route
> >> >> > > to host
> >> >> > >        at org.apache.hadoop.ipc.Client.call(Client.java:699)
> >> >> > >        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
> >> >> > >        at $Proxy4.getProtocolVersion(Unknown Source)
> >> >> > >        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
> >> >> > >        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:306)
> >> >> > >        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:343)
> >> >> > >        at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:288)
> >> >> > >        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:258)
> >> >> > >        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:205)
> >> >> > >        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1199)
> >> >> > >        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1154)
> >> >> > >        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1162)
> >> >> > >        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1284)
> >> >> > > Caused by: java.net.NoRouteToHostException: No route to host
> >> >> > >        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> >> >> > >        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
> >> >> > >        at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)
> >> >> > >        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:299)
> >> >> > >        at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
> >> >> > >        at org.apache.hadoop.ipc.Client.getConnection(Client.java:772)
> >> >> > >        at org.apache.hadoop.ipc.Client.call(Client.java:685)
> >> >> > >        ... 12 more
> >> >> > >
> >> >> > > 2009-02-03 13:00:37,739 INFO
> >> >> > > org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
> >> >> > > /************************************************************
> >> >> > > SHUTDOWN_MSG: Shutting down DataNode at slave/172.16.0.32
> >> >> > > ************************************************************/
> >> >> > >
> >> >> > >
> >> >> > > Also, the pseudo-distributed operation is working on both machines,
> >> >> > > and I am able to ssh from 'master to master' and 'master to slave'
> >> >> > > via a password-less ssh login. I do not think there is any problem
> >> >> > > with the network because cross pinging is working fine.
> >> >> > >
> >> >> > > I am working on Linux (Fedora 8).
> >> >> > >
> >> >> > > The following is the configuration which I am using:
> >> >> > >
> >> >> > > On master and slave, /conf/masters looks like this:
> >> >> > >
> >> >> > >  master
> >> >> > >
> >> >> > > On master and slave, /conf/slaves looks like this:
> >> >> > >
> >> >> > >  master
> >> >> > >  slave
> >> >> > >
> >> >> > > On both the machines conf/hadoop-site.xml looks like this:
> >> >> > >
> >> >> > > <configuration>
> >> >> > > <property>
> >> >> > >  <name>fs.default.name</name>
> >> >> > >  <value>hdfs://master:54310</value>
> >> >> > >  <description>The name of the default file system.  A URI whose
> >> >> > >  scheme and authority determine the FileSystem implementation.  The
> >> >> > >  uri's scheme determines the config property (fs.SCHEME.impl) naming
> >> >> > >  the FileSystem implementation class.  The uri's authority is used to
> >> >> > >  determine the host, port, etc. for a filesystem.</description>
> >> >> > > </property>
> >> >> > >
> >> >> > > <property>
> >> >> > >  <name>mapred.job.tracker</name>
> >> >> > >  <value>master:54311</value>
> >> >> > >  <description>The host and port that the MapReduce job tracker runs
> >> >> > >  at.  If "local", then jobs are run in-process as a single map
> >> >> > >  and reduce task.</description>
> >> >> > > </property>
> >> >> > >
> >> >> > > <property>
> >> >> > >  <name>dfs.replication</name>
> >> >> > >  <value>2</value>
> >> >> > >  <description>Default block replication.
> >> >> > >  The actual number of replications can be specified when the file is
> >> >> > >  created. The default is used if replication is not specified in
> >> >> > >  create time.</description>
> >> >> > > </property>
> >> >> > > </configuration>
> >> >> > >
> >> >> > >
> >> >> > > The namenode is formatted successfully by running
> >> >> > >
> >> >> > > "bin/hadoop namenode -format"
> >> >> > >
> >> >> > > on the master node.
> >> >> > >
> >> >> > > I am new to Hadoop and I do not know what is going wrong.
> >> >> > >
> >> >> > > Any help will be appreciated.
> >> >> > >
> >> >> > > Thanking you in advance
> >> >> > >
> >> >> > > Shefali Pawar
> >> >> > > Pune, India
> >> >> > >
> >> >>
> >> >>
> >> >>
> >> >
> >>
> >
>
