hadoop-common-dev mailing list archives

From "Michel Tourn (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-463) variable expansion in Configuration
Date Mon, 28 Aug 2006 21:59:23 GMT
     [ http://issues.apache.org/jira/browse/HADOOP-463?page=all ]

Michel Tourn updated HADOOP-463:
--------------------------------

    Attachment: confvar.patch

The patch implements the variable expansion described in this issue.
It also includes a JUnit test.

On my shared client machine I use it like this
(${user.name} expands to the value of the Java system property user.name, here "michel"):

<property>
  <name>tmp.base</name>
  <value>/tmp/${user.name}</value>
</property>

<property>
  <name>dfs.name.dir</name>
  <value>${tmp.base}/hadoop/dfs/name</value>
</property>

etc.
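
The second property above refers to ${tmp.base}, which is itself a configuration property rather than a system property, so expansion has to consult the configuration's own values as well and repeat until nothing is left to expand. The following is only a minimal sketch of that behaviour, not the code in confvar.patch: the class name ChainedExpander, the lookup order (system properties first, then configuration properties) and the cap of 20 substitutions are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not the code in confvar.patch.
final class ChainedExpander {
    // Matches ${name}; group 1 captures the variable name.
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}$]+)\\}");
    // Assumed cap so that a self-referencing value cannot loop forever.
    private static final int MAX_SUBSTITUTIONS = 20;

    private final Map<String, String> props = new HashMap<String, String>();

    void set(String name, String value) {
        props.put(name, value);
    }

    // Returns the expanded value, or null if the property is not set.
    String get(String name) {
        String raw = props.get(name);
        return raw == null ? null : expand(raw);
    }

    private String expand(String value) {
        String result = value;
        for (int i = 0; i < MAX_SUBSTITUTIONS; i++) {
            Matcher m = VAR.matcher(result);
            boolean replaced = false;
            // Substitute the first variable that can be resolved, then rescan.
            while (m.find()) {
                String var = m.group(1);
                String sub = System.getProperty(var); // assumed: system properties win
                if (sub == null) {
                    sub = props.get(var);             // assumed fallback: configuration
                }
                if (sub != null) {
                    result = result.substring(0, m.start()) + sub
                            + result.substring(m.end());
                    replaced = true;
                    break;
                }
            }
            if (!replaced) {
                return result; // nothing resolvable remains; unknown ${...} stay as-is
            }
        }
        return result;
    }
}
```

With user.name set to "michel" and the two properties above stored via set(), get("dfs.name.dir") in this sketch resolves first ${tmp.base} and then ${user.name}.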

> variable expansion in Configuration
> -----------------------------------
>
>                 Key: HADOOP-463
>                 URL: http://issues.apache.org/jira/browse/HADOOP-463
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Michel Tourn
>         Attachments: confvar.patch
>
>
> Add variable expansion to Configuration class.
> =================
> This is necessary for shared, client-side configurations:
> A Job submitter (an HDFS client) requires:
> <name>dfs.data.dir</name><value>/tmp/${user.name}/dfs</value>
> A local-mode mapreduce requires:
> <name>mapred.temp.dir</name><value>/tmp/${user.name}/mapred/tmp</value>
> Why this is necessary:
> =================
> Currently we use shared directories like:
> <name>dfs.data.dir</name><value>/tmp/dfs</value>
> This superficially seems to work.
> After all, different JVM clients create their own private subdirectory map_xxxx,
> so they will not conflict.
> What really happens:
> 1. /tmp/ is world-writable, as it is supposed to be.
> 2. Hadoop will create missing subdirectories.
> Since this is Java, a directory such as /tmp/system is created writable only by the
> JVM process user.
> 3. This is a shared client machine, so the next user's JVM will find /tmp/system owned by
> somebody else. Creating a directory within /tmp/system then fails.
> Implementation of var expansion
> =============
> In class Configuration:
> The Properties object stores the raw, unexpanded values, e.g. put("banner", "hello ${user.name}");
> In public String get(String name), postprocess the returned value:
> Use a regexp to find the pattern ${xxxx}.
> Look up xxxx as a system property.
> If found, replace ${xxxx} with the system property value.
> Else leave it as-is. An unexpanded ${xxxx} is a hint that the variable name is invalid.
> Other workarounds 
> ===============
> The other proposed workarounds are not as elegant as variable expansion.
> Workaround 1: 
> have an installation script which does:
> mkdir /tmp/dfs
> chmod uga+rw /tmp/dfs
> repeat for ALL configured subdirectories at ANY nesting level
> keep the script in sync with changes to the Hadoop XML configuration files.
> Support the script on non-Unix platforms.
> Make sure the installation script runs before Hadoop runs for the first time.
> If users change the permissions of, or delete, any of the shared directories, it breaks again.
> Workaround 2: 
> do the chmod operations from within the Hadoop code.
> In pure Java 1.4 or 1.5 this is not possible.
> It requires the Hadoop client process to have chmod privilege (rather than just mkdir
> privilege).
> It requires special-casing the directory creation code.
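
The expansion steps in the quoted description (find ${xxxx} with a regexp, look xxxx up as a system property, substitute if found, otherwise leave the text untouched) can be sketched roughly as below. This is an illustration written from that description only, not the attached patch; the class name SystemPropertyExpander and the method name expand are hypothetical, and in the real change the logic would sit behind Configuration.get(String name).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the steps in the issue description.
final class SystemPropertyExpander {
    // Matches ${xxxx}; group 1 captures xxxx.
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Postprocesses a raw stored value in a single left-to-right pass.
    static String expand(String raw) {
        if (raw == null) {
            return null;
        }
        Matcher m = VAR.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = System.getProperty(m.group(1));
            // Unknown variable: keep the literal ${xxxx} as a visible hint.
            String replacement = (value != null) ? value : m.group(0);
            // quoteReplacement stops '$' and '\' in the value being treated
            // as group references by appendReplacement.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Because the stored value is never modified, a later change to the system property is picked up on the next call, which matches the idea of expanding in get() rather than at load time.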

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
