hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4374) Fix child task environment variable config and add support for Windows
Date Mon, 09 Jun 2014 21:13:02 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025750#comment-14025750
] 

Jason Lowe commented on MAPREDUCE-4374:
---------------------------------------

I ran across a case where something broke moving from 0.23 to 2.x due to this change.  Due
to the regexp pattern matching/replacing of variable expansions added by this change, environment
variables are now expanded in the context of the code setting up the launch context rather
than in the container itself.  For example, occurrences of $PATH, $JAVA_HOME, $LD_LIBRARY_PATH,
etc. are resolved using the job client's environment when setting up the application submission
context to launch the AM, and likewise variables are expanded using the AM's environment when
setting up the container launch contexts for tasks.  Previously these variables were passed
through un-expanded and later evaluated using the environment of the container rather than
the code setting up the container launch context.

What this means is that any variable reference in an environment setting now needs to use
the new {{var}} syntax to indicate that the variable should only be expanded at container
launch time.  Any "normal" variable reference will be expanded with the environment
of the code setting up the launch context, which may not be intuitive to users, especially
given prior behavior in this area (although that code itself wasn't very consistent when it
came to variable expansion semantics).
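
To make the difference concrete, here is a rough Python model of the two expansion points
described above.  It is only a sketch of the semantics, not the actual Hadoop code: the
function names, the regular expressions, and the sample JAVA_HOME values are all assumptions
made up for illustration.

```python
import re

# Hypothetical environments: the code building the launch context (job client
# or AM) and the container that eventually runs on a node.
SETUP_ENV = {"JAVA_HOME": "/client/jdk"}
CONTAINER_ENV = {"JAVA_HOME": "/node/jdk"}

# Assumed patterns for the two reference styles: $VAR and {{VAR}}.
_DOLLAR_VAR = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")
_DEFERRED_VAR = re.compile(r"\{\{([A-Za-z_][A-Za-z0-9_]*)\}\}")


def expand_at_setup(value, setup_env):
    """Resolve $VAR references while the launch context is being built."""
    return _DOLLAR_VAR.sub(lambda m: setup_env.get(m.group(1), ""), value)


def expand_at_launch(value, container_env):
    """Resolve {{VAR}} references only when the container is launched."""
    return _DEFERRED_VAR.sub(lambda m: container_env.get(m.group(1), ""), value)


def resolve(value, setup_env, container_env):
    """Apply both phases in order: setup-time first, then launch-time."""
    return expand_at_launch(expand_at_setup(value, setup_env), container_env)


# "$JAVA_HOME" is consumed by the setup phase, so the client's value wins.
print(resolve("$JAVA_HOME/bin", SETUP_ENV, CONTAINER_ENV))       # /client/jdk/bin
# "{{JAVA_HOME}}" survives the setup phase and picks up the node's value.
print(resolve("{{JAVA_HOME}}/bin", SETUP_ENV, CONTAINER_ENV))    # /node/jdk/bin
```

In this model a setting written with $JAVA_HOME silently picks up the submitting side's
value, which is the kind of surprise described above, while the {{JAVA_HOME}} form is left
alone until the container itself is started.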

Anyway, I wanted to bring it up in case others run into a similar snag, and I'm wondering if
there's something to fix here.  If we decide to keep the semantics as-is for 2.x then we should at
least document the new behavior and new syntax outside of the Java code, e.g. in mapred-default.xml
for the various env properties, similar to what was done for mapreduce.application.classpath.

> Fix child task environment variable config and add support for Windows
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4374
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4374
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 3.0.0, 1-win, 2.1.0-beta
>            Reporter: Chuan Liu
>            Assignee: Chuan Liu
>            Priority: Minor
>             Fix For: 3.0.0, 1-win, 2.1.0-beta
>
>         Attachments: MAPREDUCE-4374-branch-1-win-2.patch, MAPREDUCE-4374-branch-1-win.patch,
MAPREDUCE-4374-trunk.2.patch, MAPREDUCE-4374-trunk.3.patch, MAPREDUCE-4374-trunk.patch
>
>
> In HADOOP-2838, a new feature was introduced to set environment variables via the Hadoop
config 'mapred.child.env' for child tasks. There have been some further fixes and improvements around
this feature, e.g. HADOOP-5981 was a bug fix; MAPREDUCE-478 broke the config into 'mapred.map.child.env'
and 'mapred.reduce.child.env'.  However, the current implementation is still not complete.
It does not match its documentation or, I believe, its original intent. Also, by using ‘:’
(colon) and ‘;’ (semicolon) in the configuration syntax, we will have problems using them
on Windows because ‘:’ appears very often in Windows paths, as in “C:\”, and environment
variables are used very often to hold path names. This Jira was created to fix the problem and
provide support on Windows.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
