hadoop-common-user mailing list archives

From Allen Wittenauer ...@yahoo-inc.com>
Subject Re: Problems running a HOD test cluster
Date Fri, 22 Feb 2008 17:34:01 GMT
On 2/21/08 10:52 AM, "Luca" <raskolnikoff77@yahoo.it> wrote:
> A few questions:
> - is Java6 ok for HOD?

    That's what we use.

> - I have an externally running HDFS cluster, as specified in
> [gridservice-hdfs]: how do I find out the fs_port of my cluster? Is it
> something specified in the hadoop-site.xml file?

    Yup.
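
    For anyone who wants to pull the port out programmatically, here is a minimal sketch. It assumes the NameNode address is stored in the fs.default.name property of hadoop-site.xml, either as hdfs://host:port or as plain host:port. The function name find_fs_port and the sample hostname are hypothetical, used only for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def find_fs_port(site_xml_text, prop="fs.default.name"):
    """Return the port from the given property in hadoop-site.xml, or None.

    Assumption: the property value looks like "hdfs://host:port" or "host:port".
    """
    root = ET.fromstring(site_xml_text)
    for p in root.iter("property"):
        if (p.findtext("name") or "").strip() == prop:
            value = (p.findtext("value") or "").strip()
            # Add a scheme so urlparse treats "host:port" as a network location.
            if "://" not in value:
                value = "hdfs://" + value
            return urlparse(value).port
    return None


# Hypothetical sample config; the real file lives in your Hadoop conf directory.
SAMPLE = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:9000</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(find_fs_port(SAMPLE))
```

    The number it returns is what would go into fs_port under [gridservice-hdfs], assuming that option refers to the NameNode port.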

> - what should I expect at the end of an allocate command? Currently what
> I get is the output above, but should I in theory return to the
> shell prompt, to issue a hadoop command?

    With HOD 0.4, yes.


> [2008-02-21 19:46:11,014] ERROR/40 torque:96 - qstat error: exit code:
> 153 | signal: False | core False
> [2008-02-21 19:46:11,017] INFO/20 hadoop:451 - Ringmaster at : None.

    I bet your ringmaster didn't come up.  Check which nodes were allocated
to your job via qstat -f.  Chances are good the first one is the ringmaster
node.  Check the torque logs, syslogs, and the hod log dir for hints as to
what happened.
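
    To make the "which nodes were allocated" step concrete, here is a rough sketch that extracts the host list from saved qstat -f output. It assumes the usual Torque layout, where the nodes appear in an exec_host attribute as host/slot entries joined by "+", and where long values wrap onto continuation lines that start with a tab. The function name allocated_nodes and the sample hostnames are hypothetical.

```python
import re


def allocated_nodes(qstat_output):
    """Return the distinct hosts from the exec_host line of `qstat -f` output.

    Assumptions: exec_host looks like "node03/0+node04/0", and wrapped
    continuation lines begin with a tab character.
    """
    # Undo line wrapping so the whole exec_host value sits on one line.
    unwrapped = re.sub(r"\n\t", "", qstat_output)
    match = re.search(r"^\s*exec_host = (\S+)", unwrapped, re.MULTILINE)
    if not match:
        return []
    nodes = []
    for slot in match.group(1).split("+"):
        host = slot.split("/")[0]  # drop the "/<slot number>" suffix
        if host not in nodes:
            nodes.append(host)
    return nodes


# Hypothetical output, e.g. captured with: qstat -f 207.server.com > job.txt
SAMPLE = (
    "Job Id: 207.server.com\n"
    "    Job_Name = HOD\n"
    "    exec_host = node03/0+node03/1+node04/0+\n"
    "\tnode05/0\n"
    "    queue = batch\n"
)

if __name__ == "__main__":
    nodes = allocated_nodes(SAMPLE)
    if nodes:
        print("likely ringmaster node:", nodes[0])
```

    The first entry is only the likely ringmaster node, so its logs are the place to start looking, not a guarantee.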


> [2008-02-21 19:46:11,021] INFO/20 hadoop:530 - Cleaning up job id
> 207.server.com, as cluster could not be allocated.
> [2008-02-21 19:46:11,025] DEBUG/10 torque:131 - /usr/bin/qdel 207.server.com
> [2008-02-21 19:46:13,079] CRITICAL/50 hod:253 - Cannot allocate cluster
> /mnt/scratch/grid/test
> [2008-02-21 19:46:13,940] DEBUG/10 hod:391 - return code: 6

