hadoop-common-user mailing list archives

From Jason Venner <ja...@attributor.com>
Subject Re: More HOD questions 0.16.0 - debug log enclosed - help with how to debug - solved
Date Tue, 26 Feb 2008 15:33:31 GMT
Well, this finally started to work, after we learned how to debug.

There were two issues. First, the torque scp command was passing 3 
arguments instead of 2, and this was causing the error logs to get eaten.

Second, on our master node the dfs hod is installed in a different place 
than on the child nodes, with a symlink placed into the 'standard 
location'. HOD/torque was forwarding the real location instead of the 
configured location.

To find out that the scp was failing, we had to raise the debug level on 
the pbs_moms by sending SIGUSR1s to them (4 seemed sufficient), then look 
at /var/log/messages to find the failure reports.
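
For anyone repeating this, here is a rough Python sketch of that debugging 
step, not the exact commands we ran. It assumes the daemon process is named 
pbs_mom, that pgrep is available, and that "4" means four SIGUSR1 signals 
per daemon (it may instead mean log level 4). The function names and the 
keyword filter are made up for illustration. It needs to run as root on 
each compute node.

```python
import os
import signal
import subprocess
import time


def find_mom_pids(process_name="pbs_mom"):
    """Return PIDs of processes named exactly process_name, using pgrep."""
    result = subprocess.run(
        ["pgrep", "-x", process_name],
        capture_output=True, text=True, check=False,
    )
    return [int(tok) for tok in result.stdout.split()]


def raise_debug_level(pids, times=4, send=os.kill, delay=0.5):
    """Send SIGUSR1 `times` times to each PID; return the number sent.

    The pause between signals is there because identical pending signals
    can be merged by the kernel if they arrive back to back.
    """
    sent = 0
    for pid in pids:
        for _ in range(times):
            send(pid, signal.SIGUSR1)
            sent += 1
            time.sleep(delay)
    return sent


def failure_lines(lines, keywords=("pbs_mom", "scp")):
    """Keep log lines that mention every keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    return [ln.rstrip("\n") for ln in lines
            if all(k in ln.lower() for k in wanted)]


def main(log_path="/var/log/messages"):
    """Bump the mom debug level, then print matching syslog lines."""
    raise_debug_level(find_mom_pids())
    with open(log_path, errors="replace") as log:
        for line in failure_lines(log):
            print(line)
```

Call main() after submitting a failing job; the keyword filter is only a 
guess at what the failure reports contain, so loosen it if nothing shows up.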

For the short term, we just made symlinks on the child nodes at the 
location where the virtual cluster was expecting to find the dfs 
configuration.
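
A minimal sketch of that workaround in Python, to be run on each child 
node. The helper name and any paths passed to it are hypothetical; use the 
real location that HOD/torque forwards from the master as expected_path, 
and the child node's actual install location as real_path.

```python
import os


def ensure_symlink(expected_path, real_path):
    """Make expected_path a symlink to real_path.

    Returns True if the link was created, False if it already existed and
    pointed at real_path. Refuses to overwrite anything else.
    """
    if os.path.islink(expected_path):
        if os.readlink(expected_path) == real_path:
            return False  # already in place, nothing to do
        raise FileExistsError(f"{expected_path} points somewhere else")
    if os.path.exists(expected_path):
        raise FileExistsError(f"{expected_path} exists and is not a symlink")
    parent = os.path.dirname(expected_path)
    if parent:
        os.makedirs(parent, exist_ok=True)  # create missing parent dirs
    os.symlink(real_path, expected_path)
    return True
```

It is safe to run repeatedly, which helps when pushing it out to every 
node in the cluster.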



Hemanth Yamijala wrote:
> Jason Venner wrote:
>> My hadoop jobs don't start
>> This is configured to use an existing DFS and to unpack a tarball 
>> with a cut down 0.16.0 config
>> I have looked in the mom logs on the client machines and am not 
>> getting anything meaningful.
>>
> What is your hod command line ? Specifically, how did you provide the 
> tarball option ?
> Can you attach the log of the hod command, like you did the hodrc. 
> There are some lines in the output that don't seem complete.
> Set your debug option in the [ringmaster] section to 4, and rerun hod. 
> Under the log-dir specified in the [ringmaster] section you will be 
> able to see a log file corresponding to your jobid. Can you attach 
> that too ? The ringmaster node is the first one allocated by torque 
> for the job, that is, the mother superior for the job.
> How is your tarball built ? Can you check that there's no 
> hadoop-env.sh with pre-filled values in it ? Look at HADOOP-2860.
>
> Thanks
> Hemanth
>
-- 
Jason Venner
Attributor - Publish with Confidence <http://www.attributor.com/>
Attributor is hiring Hadoop Wranglers, contact if interested
