hadoop-common-user mailing list archives

From David Milne <d.n.mi...@gmail.com>
Subject Problems with HOD and HDFS
Date Thu, 10 Jun 2010 02:56:57 GMT
Hi there,

I am trying to get Hadoop on Demand up and running, but am having
problems with the ringmaster not being able to communicate with HDFS.

With full verbosity, the output from the hod allocate command ends with:

[2010-06-10 14:40:22,650] CRITICAL/50 hadoop:298 - Failed to retrieve 'hdfs' service address.
[2010-06-10 14:40:22,654] DEBUG/10 hadoop:631 - Cleaning up cluster id 34029.symphony.cs.waikato.ac.nz, as cluster could not be allocated.
[2010-06-10 14:40:22,655] DEBUG/10 hadoop:635 - Calling rm.stop()
[2010-06-10 14:40:22,665] DEBUG/10 hadoop:637 - Returning from rm.stop()
[2010-06-10 14:40:22,666] CRITICAL/50 hod:401 - Cannot allocate cluster /home/dmilne/hadoop/cluster
[2010-06-10 14:40:23,090] DEBUG/10 hod:597 - return code: 7


I've included the hodrc file below. Briefly, HOD is supposed to
provision an HDFS cluster as well as a Map/Reduce cluster, and it
seems to be failing to do so. The ringmaster log looks like this:

[2010-06-10 14:36:05,144] DEBUG/10 ringMaster:479 - getServiceAddr name: hdfs
[2010-06-10 14:36:05,145] DEBUG/10 ringMaster:487 - getServiceAddr service: <hodlib.GridServices.hdfs.Hdfs instance at 0x8f97e8>
[2010-06-10 14:36:05,147] DEBUG/10 ringMaster:504 - getServiceAddr addr hdfs: not found
[2010-06-10 14:36:06,195] DEBUG/10 ringMaster:479 - getServiceAddr name: hdfs
[2010-06-10 14:36:06,197] DEBUG/10 ringMaster:487 - getServiceAddr service: <hodlib.GridServices.hdfs.Hdfs instance at 0x8f97e8>
[2010-06-10 14:36:06,198] DEBUG/10 ringMaster:504 - getServiceAddr addr hdfs: not found

... and so on, until it gives up.
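
To see how often the lookup fails and for which service, the log can be scanned for the "not found" lines. The following is a minimal sketch, not part of HOD: it assumes the log format shown in the excerpt above (with wrapped lines rejoined), is written for Python 3 rather than the Python 2.5.2 on the cluster, and the names SAMPLE_LOG, NOT_FOUND and failed_lookups are made up for illustration.

```python
import re
from collections import Counter

# Lines copied from the ringmaster log excerpt, wrapped lines rejoined.
SAMPLE_LOG = """\
[2010-06-10 14:36:05,144] DEBUG/10 ringMaster:479 - getServiceAddr name: hdfs
[2010-06-10 14:36:05,145] DEBUG/10 ringMaster:487 - getServiceAddr service: <hodlib.GridServices.hdfs.Hdfs instance at 0x8f97e8>
[2010-06-10 14:36:05,147] DEBUG/10 ringMaster:504 - getServiceAddr addr hdfs: not found
[2010-06-10 14:36:06,195] DEBUG/10 ringMaster:479 - getServiceAddr name: hdfs
[2010-06-10 14:36:06,197] DEBUG/10 ringMaster:487 - getServiceAddr service: <hodlib.GridServices.hdfs.Hdfs instance at 0x8f97e8>
[2010-06-10 14:36:06,198] DEBUG/10 ringMaster:504 - getServiceAddr addr hdfs: not found
"""

# Assumed pattern: "[timestamp] ... getServiceAddr addr <service>: not found"
NOT_FOUND = re.compile(
    r"^\[(?P<ts>[^\]]+)\] .* getServiceAddr addr (?P<service>\w+): not found$"
)


def failed_lookups(log_text):
    """Return (timestamp, service) pairs for each failed getServiceAddr lookup."""
    hits = []
    for line in log_text.splitlines():
        match = NOT_FOUND.match(line)
        if match:
            hits.append((match.group("ts"), match.group("service")))
    return hits


hits = failed_lookups(SAMPLE_LOG)
counts = Counter(service for _, service in hits)
for service, count in counts.items():
    print(f"{service}: {count} failed lookups, first at {hits[0][0]}")
```

To run it against the real log, read the file under the ringmaster log-dir and pass its contents to failed_lookups instead of SAMPLE_LOG.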

Any ideas why? One red flag is that when running the allocate command,
some of the variables echoed back look dodgy:

--gridservice-hdfs.fs_port 0
--gridservice-hdfs.host localhost
--gridservice-hdfs.info_port 0

These are not what I specified in the hodrc. Are the port numbers just
set to 0 because I am not using an external HDFS, or is this a
problem?


The software versions involved are:
 - Hadoop 0.20.2
 - Python 2.5.2 (no Twisted)
 - Java 1.6.0_20
 - Torque 2.4.5


The hodrc file looks like this:

[hod]
stream                          = True
java-home                       = /opt/jdk1.6.0_20
cluster                         = debian5
cluster-factor                  = 1.8
xrs-port-range                  = 32768-65536
debug                           = 3
allocate-wait-time              = 3600
temp-dir                        = /scratch/local/dmilne/hod

[ringmaster]
register                        = True
stream                          = False
temp-dir                        = /scratch/local/dmilne/hod
log-dir                         = /scratch/local/dmilne/hod/log
http-port-range                 = 8000-9000
idleness-limit                  = 864000
work-dirs                       = /scratch/local/dmilne/hod/1,/scratch/local/dmilne/hod/2
xrs-port-range                  = 32768-65536
debug                           = 4

[hodring]
stream                          = False
temp-dir                        = /scratch/local/dmilne/hod
log-dir                         = /scratch/local/dmilne/hod/log
register                        = True
java-home                       = /opt/jdk1.6.0_20
http-port-range                 = 8000-9000
xrs-port-range                  = 32768-65536
debug                           = 4

[resource_manager]
queue                           = express
batch-home                      = /opt/torque-2.4.5
id                              = torque
options                         = l:pmem=3812M,W:X="NACCESSPOLICY:SINGLEJOB"
#env-vars                       = HOD_PYTHON_HOME=/foo/bar/python-2.5.1/bin/python

[gridservice-mapred]
external                        = False
pkgs                            = /opt/hadoop-0.20.2
tracker_port                    = 8030
info_port                       = 50080

[gridservice-hdfs]
external                        = False
pkgs                            = /opt/hadoop-0.20.2
fs_port                         = 8020
info_port                       = 50070

Cheers,
Dave
