hadoop-common-user mailing list archives

From David Milne <d.n.mi...@gmail.com>
Subject Re: Problems with HOD and HDFS
Date Mon, 14 Jun 2010 02:33:18 GMT
Anybody? I am completely stuck here. I have no idea who else I can ask
or where I can go for more information. Is there somewhere specific
where I should be asking about HOD?

Thank you,
Dave

On Thu, Jun 10, 2010 at 2:56 PM, David Milne <d.n.milne@gmail.com> wrote:
> Hi there,
>
> I am trying to get Hadoop on Demand up and running, but am having
> problems with the ringmaster not being able to communicate with HDFS.
>
> The output from the hod allocate command ends with this, with full verbosity:
>
> [2010-06-10 14:40:22,650] CRITICAL/50 hadoop:298 - Failed to retrieve
> 'hdfs' service address.
> [2010-06-10 14:40:22,654] DEBUG/10 hadoop:631 - Cleaning up cluster id
> 34029.symphony.cs.waikato.ac.nz, as cluster could not be allocated.
> [2010-06-10 14:40:22,655] DEBUG/10 hadoop:635 - Calling rm.stop()
> [2010-06-10 14:40:22,665] DEBUG/10 hadoop:637 - Returning from rm.stop()
> [2010-06-10 14:40:22,666] CRITICAL/50 hod:401 - Cannot allocate
> cluster /home/dmilne/hadoop/cluster
> [2010-06-10 14:40:23,090] DEBUG/10 hod:597 - return code: 7
>
>
> I've attached the hodrc file below, but briefly HOD is supposed to
> provision an HDFS cluster as well as a Map/Reduce cluster, and seems
> to be failing to do so. The ringmaster log looks like this:
>
> [2010-06-10 14:36:05,144] DEBUG/10 ringMaster:479 - getServiceAddr name: hdfs
> [2010-06-10 14:36:05,145] DEBUG/10 ringMaster:487 - getServiceAddr
> service: <hodlib.GridServices.hdfs.Hdfs instance at 0x8f97e8>
> [2010-06-10 14:36:05,147] DEBUG/10 ringMaster:504 - getServiceAddr
> addr hdfs: not found
> [2010-06-10 14:36:06,195] DEBUG/10 ringMaster:479 - getServiceAddr name: hdfs
> [2010-06-10 14:36:06,197] DEBUG/10 ringMaster:487 - getServiceAddr
> service: <hodlib.GridServices.hdfs.Hdfs instance at 0x8f97e8>
> [2010-06-10 14:36:06,198] DEBUG/10 ringMaster:504 - getServiceAddr
> addr hdfs: not found
>
> ... and so on, until it gives up
>
> Any ideas why? One red flag is that when running the allocate command,
> some of the variables echoed back look dodgy:
>
> --gridservice-hdfs.fs_port 0
> --gridservice-hdfs.host localhost
> --gridservice-hdfs.info_port 0
>
> These are not what I specified in the hodrc. Are the port numbers just
> set to 0 because I am not using an external HDFS, or is this a
> problem?
>
>
> The software versions involved are:
>  - Hadoop 0.20.2
>  - Python 2.5.2 (no Twisted)
>  - Java 1.6.0_20
>  - Torque 2.4.5
>
>
> The hodrc file looks like this:
>
> [hod]
> stream                          = True
> java-home                       = /opt/jdk1.6.0_20
> cluster                         = debian5
> cluster-factor                  = 1.8
> xrs-port-range                  = 32768-65536
> debug                           = 3
> allocate-wait-time              = 3600
> temp-dir                        = /scratch/local/dmilne/hod
>
> [ringmaster]
> register                        = True
> stream                          = False
> temp-dir                        = /scratch/local/dmilne/hod
> log-dir                         = /scratch/local/dmilne/hod/log
> http-port-range                 = 8000-9000
> idleness-limit                  = 864000
> work-dirs                       = /scratch/local/dmilne/hod/1,/scratch/local/dmilne/hod/2
> xrs-port-range                  = 32768-65536
> debug                           = 4
>
> [hodring]
> stream                          = False
> temp-dir                        = /scratch/local/dmilne/hod
> log-dir                         = /scratch/local/dmilne/hod/log
> register                        = True
> java-home                       = /opt/jdk1.6.0_20
> http-port-range                 = 8000-9000
> xrs-port-range                  = 32768-65536
> debug                           = 4
>
> [resource_manager]
> queue                           = express
> batch-home                      = /opt/torque-2.4.5
> id                              = torque
> options                         = l:pmem=3812M,W:X="NACCESSPOLICY:SINGLEJOB"
> #env-vars                       = HOD_PYTHON_HOME=/foo/bar/python-2.5.1/bin/python
>
> [gridservice-mapred]
> external                        = False
> pkgs                            = /opt/hadoop-0.20.2
> tracker_port                    = 8030
> info_port                       = 50080
>
> [gridservice-hdfs]
> external                        = False
> pkgs                            = /opt/hadoop-0.20.2
> fs_port                         = 8020
> info_port                       = 50070
>
> Cheers,
> Dave
>
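The mismatch between the hodrc and the values echoed back by the allocate command can be checked mechanically. The following is a minimal sketch in modern Python 3 (not the Python 2.5.2 used above), assuming the hodrc is plain INI syntax that the standard configparser module can read and that echoed options have the form "--section.option value" as in the message. The helper names parse_echoed and diff_against_hodrc are hypothetical and are not part of HOD. The sketch only reports differences; it does not say whether port 0 is expected when HDFS is not external.

```python
import configparser

# The [gridservice-hdfs] section as given in the message.
HODRC_TEXT = """
[gridservice-hdfs]
external                        = False
pkgs                            = /opt/hadoop-0.20.2
fs_port                         = 8020
info_port                       = 50070
"""

# The options echoed back by the allocate command, as given in the message.
ECHOED = """
--gridservice-hdfs.fs_port 0
--gridservice-hdfs.host localhost
--gridservice-hdfs.info_port 0
"""


def parse_echoed(text):
    """Turn '--section.option value' lines into {(section, option): value}."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("--"):
            continue
        key, _, value = line[2:].partition(" ")
        section, _, option = key.partition(".")
        result[(section, option)] = value.strip()
    return result


def diff_against_hodrc(hodrc_text, echoed_text):
    """Return (section, option, configured, echoed) for every mismatch.

    configured is None when the hodrc does not set the option at all.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(hodrc_text)
    diffs = []
    for (section, option), echoed in sorted(parse_echoed(echoed_text).items()):
        if parser.has_option(section, option):
            configured = parser.get(section, option)
        else:
            configured = None
        if configured != echoed:
            diffs.append((section, option, configured, echoed))
    return diffs


if __name__ == "__main__":
    for section, option, configured, echoed in diff_against_hodrc(HODRC_TEXT, ECHOED):
        print(f"{section}.{option}: hodrc={configured!r} echoed={echoed!r}")
```

Run against the full hodrc, the same comparison would list every option whose echoed value differs from the file, which narrows down whether the file is being read at all or only some sections are being overridden.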

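To see how long the ringmaster keeps retrying before giving up, the repeated getServiceAddr lookups can be counted from the log. This is a rough sketch in Python 3, assuming every record starts with "[timestamp] LEVEL/number source:line - message" as in the excerpts above and that any line not matching that pattern is a continuation of the previous record. The names RECORD_START, parse_records and failed_lookups are hypothetical, not part of HOD.

```python
import re

# One log record: "[timestamp] LEVEL/number source:line - message"
RECORD_START = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] "
    r"(?P<level>[A-Z]+)/(?P<num>\d+) (?P<source>\w+):(?P<line>\d+) - (?P<msg>.*)$"
)

# Ringmaster log excerpt from the message, with the quote markers removed.
SAMPLE_LOG = """\
[2010-06-10 14:36:05,144] DEBUG/10 ringMaster:479 - getServiceAddr name: hdfs
[2010-06-10 14:36:05,145] DEBUG/10 ringMaster:487 - getServiceAddr
service: <hodlib.GridServices.hdfs.Hdfs instance at 0x8f97e8>
[2010-06-10 14:36:05,147] DEBUG/10 ringMaster:504 - getServiceAddr
addr hdfs: not found
[2010-06-10 14:36:06,195] DEBUG/10 ringMaster:479 - getServiceAddr name: hdfs
[2010-06-10 14:36:06,197] DEBUG/10 ringMaster:487 - getServiceAddr
service: <hodlib.GridServices.hdfs.Hdfs instance at 0x8f97e8>
[2010-06-10 14:36:06,198] DEBUG/10 ringMaster:504 - getServiceAddr
addr hdfs: not found
"""


def parse_records(text):
    """Split log text into records, folding continuation lines into the message."""
    records = []
    for raw in text.splitlines():
        match = RECORD_START.match(raw)
        if match:
            records.append(match.groupdict())
        elif records and raw.strip():
            # Not a record start: treat it as the tail of the previous message.
            records[-1]["msg"] += " " + raw.strip()
    return records


def failed_lookups(records, service="hdfs"):
    """Return the timestamps of records reporting the service address as not found."""
    needle = f"addr {service}: not found"
    return [r["ts"] for r in records if needle in r["msg"]]


if __name__ == "__main__":
    attempts = failed_lookups(parse_records(SAMPLE_LOG))
    print(len(attempts), "failed lookups, first at", attempts[0], "last at", attempts[-1])
```

Comparing the first and last timestamps from the complete log with the time of the CRITICAL message in the hod output shows whether the allocation fails as soon as the lookups stop or only after a longer wait.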