hadoop-general mailing list archives

From Kevin Van Workum <v...@sabalcore.com>
Subject Re: hadoop on demand setup: Failed to retrieve 'hdfs' service address
Date Tue, 13 Apr 2010 02:52:21 GMT
On Mon, Apr 12, 2010 at 8:52 PM, Boyu Zhang <boyuzhang35@gmail.com> wrote:
> Hi Kevin,
>
> Sorry to bother again, I am wondering in order to get HOD to work, do we
> need to install all the prerequisite software like passwordless ssh? Thanks
> a lot!

SSH is not needed for HOD; it uses pbsdsh to launch processes on the nodes.

HOD seems to be very sensitive to the Python version: 2.4 and 2.6
don't work; you need 2.5.

HOD is a little more flexible with Java: 1.5 and 1.6 both seem to work
for me. Also, the most recent versions of Twisted and zope seem to be
fine.
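
As a rough sketch of the version constraint above (this is not part of
HOD; the function name hod_python_ok is made up for illustration), a
pre-flight check on the interpreter could look like this:

```python
import sys


def hod_python_ok(version_info=None):
    """Return True only for a Python 2.5.x interpreter.

    Assumption: 2.5 is the only series that works with HOD, as reported
    in this thread. This helper is hypothetical and not shipped with HOD.
    """
    if version_info is None:
        # Default to the interpreter running this script.
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    return (major, minor) == (2, 5)
```

Running it with the interpreter that HOD_PYTHON_HOME points to would
confirm whether the nodes pick up a 2.5 build.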

>
> Boyu
>
> On Tue, Apr 6, 2010 at 10:43 AM, Kevin Van Workum <vanw@sabalcore.com> wrote:
>
>> Hello,
>>
>> I'm trying to set up hadoop on demand (HOD) on my cluster. I'm
>> currently unable to "allocate cluster". I'm starting hod with the
>> following command:
>>
>> /usr/local/hadoop-0.20.2/hod/bin/hod -c
>> /usr/local/hadoop-0.20.2/hod/conf/hodrc -t
>> /b/01/vanw/hod/hadoop-0.20.2.tar.gz -o "allocate ~/hod 3"
>> --ringmaster.log-dir=/tmp -b 4
>>
>> The job starts on the nodes and I see the ringmaster running on the
>> MotherSuperior. The ringmaster-main.log file is created, but is empty.
>> I don't see any associated processes running on the other 2 nodes in
>> the job.
>>
>> The critical errors are as follows:
>>
>> [2010-04-06 10:34:13,630] CRITICAL/50 hadoop:298 - Failed to retrieve
>> 'hdfs' service address.
>> [2010-04-06 10:34:13,631] DEBUG/10 hadoop:631 - Cleaning up cluster id
>> 238366.jman, as cluster could not be allocated.
>> [2010-04-06 10:34:13,632] DEBUG/10 hadoop:635 - Calling rm.stop()
>> [2010-04-06 10:34:13,639] DEBUG/10 hadoop:637 - Returning from rm.stop()
>> [2010-04-06 10:34:13,639] CRITICAL/50 hod:401 - Cannot allocate
>> cluster /b/01/vanw/hod
>> [2010-04-06 10:34:14,149] DEBUG/10 hod:597 - return code: 7
>>
>> The contents of the hodrc file are:
>>
>> [hod]
>> stream                          = True
>> java-home                       = /usr/local/jdk1.6.0_02
>> cluster                         = orange
>> cluster-factor                  = 1.8
>> xrs-port-range                  = 32768-65536
>> debug                           = 4
>> allocate-wait-time              = 3600
>> temp-dir                        = /tmp/hod
>>
>> [ringmaster]
>> register                        = True
>> stream                          = False
>> temp-dir                        = /tmp/hod
>> http-port-range                 = 8000-9000
>> work-dirs                       = /tmp/hod/1,/tmp/hod/2
>> xrs-port-range                  = 32768-65536
>> debug                           = 4
>>
>> [hodring]
>> stream                          = False
>> temp-dir                        = /tmp/hod
>> register                        = True
>> java-home                       = /usr/local/jdk1.6.0_02
>> http-port-range                 = 8000-9000
>> xrs-port-range                  = 32768-65536
>> debug                           = 4
>>
>> [resource_manager]
>> queue                           = dque
>> batch-home                      = /usr/local/torque-2.3.7
>> id                              = torque
>> env-vars                        = HOD_PYTHON_HOME=/usr/local/python-2.5.5/bin/python
>>
>> [gridservice-mapred]
>> external                        = False
>> pkgs                            = /usr/local/hadoop-0.20.2
>> tracker_port                    = 8030
>> info_port                       = 50080
>>
>> [gridservice-hdfs]
>> external                        = False
>> pkgs                            = /usr/local/hadoop-0.20.2
>> fs_port                         = 8020
>> info_port                       = 50070
>>
>>
>> Some other useful information:
>> Linux 2.6.18-128.7.1.el5
>> Python 2.5.5
>> Twisted 10.0.0
>> zope 3.3.0
>> java version "1.6.0_02"
>>
>> --
>> Kevin Van Workum, PhD
>> Sabalcore Computing Inc.
>> Run your code on 500 processors.
>> Sign up for a free trial account.
>> www.sabalcore.com
>> 877-492-8027 ext. 11
>>
>
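
For picking the CRITICAL entries out of a longer HOD log like the one
quoted above, a minimal sketch follows. It assumes every record starts
with "[timestamp] LEVEL/number module:line - message", which is inferred
only from the lines in this thread, and it treats any other non-empty
line as a wrapped continuation of the previous message. The names
parse_hod_log and critical_messages are hypothetical.

```python
import re

# Pattern inferred from the log excerpt in this thread, e.g.
# [2010-04-06 10:34:13,630] CRITICAL/50 hadoop:298 - Failed to retrieve
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z]+)/(?P<num>\d+)\s+"
    r"(?P<module>\w+):(?P<lineno>\d+)\s+-\s+(?P<msg>.*)$"
)


def parse_hod_log(text):
    """Split log text into records, re-joining wrapped message lines."""
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = LOG_LINE.match(line)
        if match:
            records.append(match.groupdict())
        elif records:
            # Not a new record: treat it as the wrapped tail of the last one.
            records[-1]["msg"] += " " + line
    return records


def critical_messages(text):
    """Return the message text of every CRITICAL record."""
    return [r["msg"] for r in parse_hod_log(text) if r["level"] == "CRITICAL"]
```

Calling critical_messages() on the contents of a hod log file returns
only the failure messages, without the surrounding DEBUG lines.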



-- 
Kevin Van Workum, PhD
Sabalcore Computing Inc.
Run your code on 500 processors.
Sign up for a free trial account.
www.sabalcore.com
877-492-8027 ext. 11
