hadoop-general mailing list archives

From Boyu Zhang <boyuzhan...@gmail.com>
Subject Re: hadoop on demand setup: Failed to retrieve 'hdfs' service address
Date Tue, 13 Apr 2010 14:43:16 GMT
Thanks a lot for the reply. I will go over the scripts to see the details.

I am also a little confused about which process starts the hdfs and mapreduce
daemons: the ringmaster or the hodring?

And I am wondering how the hadoop daemons work once they are up. Do they
communicate the same way they would without HOD (daemons talking directly to
each other), or does everything have to go through the ringmaster and the
hodring? Thanks a lot for the time and help!
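
Once I do get an allocation to succeed, I suppose I can check part of this
myself by querying the cluster directory and looking at the client config
HOD generates there. Something along these lines, assuming the info
operation takes the cluster directory the same way allocate does in the
command further down:

  /usr/local/hadoop-0.20.2/hod/bin/hod -c \
      /usr/local/hadoop-0.20.2/hod/conf/hodrc -o "info ~/hod"
  cat ~/hod/hadoop-site.xml   # the hadoop-site.xml HOD writes into the cluster directory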

Boyu

On Mon, Apr 12, 2010 at 10:52 PM, Kevin Van Workum <vanw@sabalcore.com> wrote:

> On Mon, Apr 12, 2010 at 8:52 PM, Boyu Zhang <boyuzhang35@gmail.com> wrote:
> > Hi Kevin,
> >
> > Sorry to bother again. I am wondering, in order to get HOD to work, do we
> > need to install all the prerequisite software, like passwordless ssh?
> > Thanks a lot!
>
> SSH is not needed for HOD; it uses pbsdsh to launch processes on the nodes.
>
> HOD seems to be very sensitive to the Python version: 2.4 and 2.6
> don't work; you need 2.5.
>
> HOD is a little more flexible with Java: 1.5 and 1.6 both seem to work
> for me. Also, the most recent versions of Twisted and zope seem to be
> fine.
>
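
A quick way to double-check those from inside a Torque job, just as a
sketch: pbsdsh ships with Torque and may need full paths since it starts
tasks with a minimal environment, and the interpreter paths below are
simply the ones from the hodrc quoted further down.

  pbsdsh -u hostname                                   # pbsdsh reaches every allocated node once
  pbsdsh -u /usr/local/python-2.5.5/bin/python -V      # should print 2.5.x on each node
  pbsdsh -u /usr/local/jdk1.6.0_02/bin/java -version   # 1.5 or 1.6 should both be fine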
> >
> > Boyu
> >
> > On Tue, Apr 6, 2010 at 10:43 AM, Kevin Van Workum <vanw@sabalcore.com> wrote:
> >
> >> Hello,
> >>
> >> I'm trying to set up Hadoop on Demand (HOD) on my cluster. I'm
> >> currently unable to "allocate cluster". I'm starting hod with the
> >> following command:
> >>
> >> /usr/local/hadoop-0.20.2/hod/bin/hod -c
> >> /usr/local/hadoop-0.20.2/hod/conf/hodrc -t
> >> /b/01/vanw/hod/hadoop-0.20.2.tar.gz -o "allocate ~/hod 3"
> >> --ringmaster.log-dir=/tmp -b 4
> >>
> >> The job starts on the nodes and I see the ringmaster running on the
> >> MotherSuperior. The ringmaster-main.log file is created, but is empty.
> >> I don't see any associated processes running on the other 2 nodes in
> >> the job.
> >>
> >> The critical errors are as follows:
> >>
> >> [2010-04-06 10:34:13,630] CRITICAL/50 hadoop:298 - Failed to retrieve
> >> 'hdfs' service address.
> >> [2010-04-06 10:34:13,631] DEBUG/10 hadoop:631 - Cleaning up cluster id
> >> 238366.jman, as cluster could not be allocated.
> >> [2010-04-06 10:34:13,632] DEBUG/10 hadoop:635 - Calling rm.stop()
> >> [2010-04-06 10:34:13,639] DEBUG/10 hadoop:637 - Returning from rm.stop()
> >> [2010-04-06 10:34:13,639] CRITICAL/50 hod:401 - Cannot allocate
> >> cluster /b/01/vanw/hod
> >> [2010-04-06 10:34:14,149] DEBUG/10 hod:597 - return code: 7
> >>
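
For reference, going by the paths in that command and in the hodrc below,
the places any surviving logs should land appear to be these (the
directories only exist if the processes got far enough to create them):

  # on the MotherSuperior (--ringmaster.log-dir and the [ringmaster] temp-dir/work-dirs)
  cat /tmp/ringmaster-main.log
  ls -lR /tmp/hod

  # on each of the other two nodes (the [hodring] temp-dir)
  ls -lR /tmp/hod
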
> >> The contents of the hodrc file are:
> >>
> >> [hod]
> >> stream                          = True
> >> java-home                       = /usr/local/jdk1.6.0_02
> >> cluster                         = orange
> >> cluster-factor                  = 1.8
> >> xrs-port-range                  = 32768-65536
> >> debug                           = 4
> >> allocate-wait-time              = 3600
> >> temp-dir                        = /tmp/hod
> >>
> >> [ringmaster]
> >> register                        = True
> >> stream                          = False
> >> temp-dir                        = /tmp/hod
> >> http-port-range                 = 8000-9000
> >> work-dirs                       = /tmp/hod/1,/tmp/hod/2
> >> xrs-port-range                  = 32768-65536
> >> debug                           = 4
> >>
> >> [hodring]
> >> stream                          = False
> >> temp-dir                        = /tmp/hod
> >> register                        = True
> >> java-home                       = /usr/local/jdk1.6.0_02
> >> http-port-range                 = 8000-9000
> >> xrs-port-range                  = 32768-65536
> >> debug                           = 4
> >>
> >> [resource_manager]
> >> queue                           = dque
> >> batch-home                      = /usr/local/torque-2.3.7
> >> id                              = torque
> >> env-vars                        = HOD_PYTHON_HOME=/usr/local/python-2.5.5/bin/python
> >>
> >> [gridservice-mapred]
> >> external                        = False
> >> pkgs                            = /usr/local/hadoop-0.20.2
> >> tracker_port                    = 8030
> >> info_port                       = 50080
> >>
> >> [gridservice-hdfs]
> >> external                        = False
> >> pkgs                            = /usr/local/hadoop-0.20.2
> >> fs_port                         = 8020
> >> info_port                       = 50070
> >>
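
And for whenever the allocation does go through, the two info_port values
above should give a quick liveness check, assuming they map to the daemons'
web UIs in the usual way; the host names are placeholders for whichever
nodes HOD ends up picking:

  curl -s http://NAMENODE_HOST:50070/ | head -1      # info_port from [gridservice-hdfs]
  curl -s http://JOBTRACKER_HOST:50080/ | head -1    # info_port from [gridservice-mapred]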
> >>
> >> Some other useful information:
> >> Linux 2.6.18-128.7.1.el5
> >> Python 2.5.5
> >> Twisted 10.0.0
> >> zope 3.3.0
> >> java version "1.6.0_02"
> >>
> >> --
> >> Kevin Van Workum, PhD
> >> Sabalcore Computing Inc.
> >> Run your code on 500 processors.
> >> Sign up for a free trial account.
> >> www.sabalcore.com
> >> 877-492-8027 ext. 11
> >>
> >
>
>
>
> --
> Kevin Van Workum, PhD
> Sabalcore Computing Inc.
> Run your code on 500 processors.
> Sign up for a free trial account.
> www.sabalcore.com
> 877-492-8027 ext. 11
>
