hadoop-common-dev mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2862) [HOD] Support PBS env vars in hod configuration
Date Fri, 22 Feb 2008 04:06:19 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12571275#action_12571275 ]

Hemanth Yamijala commented on HADOOP-2862:

HOD currently uses two separate configuration options for temp space. One is temp-dir,
which it uses for HOD-specific temporary work. The other is work-dirs, which it uses for
Hadoop-specific work space; the HDFS and Mapred work directories are set up under the latter.
The value specified in the configuration file for each option is a root under which per-job
directories are set up, owned by the user. These are deleted at the end of the job. At least,
AFAIK, the data is deleted, but sometimes the directories themselves are not cleaned up, which
looks like a bug in HOD.

That said, I see what you are proposing has 2 advantages:

- Currently, HOD requires the temp directory to be world-writable so that different users can
write to it. With your method, we no longer need that requirement. It seems cleaner.
- We are assured of clean-up of the directories as well. That said, I think HOD should still
take responsibility for cleaning up what it creates.

It seems, therefore, like a useful addition.

> [HOD] Support PBS env vars in hod configuration
> -----------------------------------------------
>                 Key: HADOOP-2862
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2862
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/hod
>    Affects Versions: 0.16.0
>         Environment: Torque PBS
>            Reporter: Craig Macdonald
> In some batch environments, e.g. using Torque PBS, scratch spaces are provided on cluster
> nodes where jobs should put their temporary files. These are automatically cleaned up
> by an epilogue script when the job exits.
> For instance, in our local Torque cluster, all nodes have a /scratch partition. For each
> job, the prologue script creates a scratch folder owned by the user at /scratch/pbstmp.$PBS_JOBID,
> where $PBS_JOBID is the env var containing the job id, as set by pbs_mom.
> Would it be possible to use these env vars in the configuration of hod? For instance,
> say I want to create an hdfs on demand using hod, but the hdfs space should be in /scratch/pbstmp.$PBS_JOBID,
> not in, say, /tmp/hod. This would involve HOD supporting env vars in its configuration, and knowing
> when to substitute each env var with its current value (i.e. not until running on the correct
> node where the operation should take place).
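
The deferred substitution asked for above could look roughly like the following. This is only a minimal sketch, not HOD's actual code: the function name expand_on_node and the sample job id are hypothetical, and it assumes the configured value is carried to the compute node as a literal, unexpanded string and only expanded there.

```python
import os
from string import Template


def expand_on_node(value, env=None):
    """Expand $VAR / ${VAR} references in a configured path.

    Hypothetical helper: it is meant to be called on the node where the
    directory will actually be used, so that the node-local environment
    (for example the one set up by pbs_mom) supplies the values.
    Unknown variables are left untouched rather than raising an error.
    """
    env = os.environ if env is None else env
    return Template(value).safe_substitute(env)


# The configuration keeps the literal string; nothing is expanded at
# submission time.
configured_work_dir = "/scratch/pbstmp.$PBS_JOBID"

# On the compute node the environment is consulted. A stand-in mapping is
# used here so the example does not depend on a real PBS job.
node_env = {"PBS_JOBID": "1234.example-host"}  # hypothetical job id
work_dir = expand_on_node(configured_work_dir, node_env)
print(work_dir)
```

Leaving unknown variables in place (safe_substitute) is a design choice for the sketch: a value read on the submitting host, where PBS_JOBID is not yet set, passes through unchanged and can still be expanded later on the right node.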

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
