hadoop-common-user mailing list archives

From Seb Seith <sase...@gmail.com>
Subject Re: HOD client-side hodring issue
Date Thu, 27 Aug 2009 20:11:54 GMT
I just realized that the list finally posted my original message
moments after I resent it.  After that much time I had assumed the
system had not received it, since it had not appeared on the archive
or on the list itself.  Sorry for the double post.

On Thu, Aug 27, 2009 at 3:01 PM, Seb Seith <saseith@gmail.com> wrote:
> Hello,
>   I have been working on setting up a Hadoop on Demand cluster on
> three machines and have run into a bit of a snag.  I went through the
> admin and user guides and have successfully installed Torque and HOD.
> When I run "hod allocate" it successfully starts hodring on 2 of the
> machines but not the third.  The result is that I have a working
> Namenode and Jobtracker (though its UI does not seem to work
> presently) but no slave nodes.
>   Even at debug level 4 in all sections there is nothing to indicate
> a failure: the ringmaster has no problem communicating with the
> running hodring jobs, and pbsdsh returns without error.  I can find no
> logs on any of the machines indicating a Torque issue (though I admit
> I am not terribly familiar with Torque) and no logs at all for HOD on
> the machine that is not running hodring.
>   It would appear that the pbsdsh job simply isn't starting hodring
> on the one node given the lack of any HOD log on that machine.  Either
> it is not recognizing the node (seems somewhat unlikely as it comes up
> in pbsnodes as free) or there is a relatively silent failure
> somewhere.  If you have any suggestions I would much appreciate them.
>   One quick side note: I have successfully run a standard
> hadoop-0.20.0 cluster on these three machines with no difficulty,
> which should rule out connection, ssh, or firewall issues.
> Thanks,
> Seb
