hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jim Brennan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-7069) Add ability to specify user environment variables individually
Date Wed, 04 Apr 2018 17:40:00 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425908#comment-16425908

Jim Brennan commented on MAPREDUCE-7069:

[~jlowe] thanks for the thorough review.  Much appreciated!
{quote}It's a bit odd and inefficient that setVMEnv calls MRApps.setEnvFromInputProperty twice.
I think it would be clearer and more efficient to call it once, place the results in a temporary
map (like it already does in the second call), then only set HADOOP_ROOT_LOGGER and HADOOP_CLIENT_OPTS
in the environment if they are not set in the temporary map. Then at the end we can simply
call addAll to dump the contents of the temporary map into the environment map.
The reason it was done with two calls is because of the way environment variables are handled
when they are already defined in the environment map. If you an environment variable that
you are updating already exists in the environment, setEnvFromInput* functions append the
new value to the existing value, using the appropriate separator. The special handling for
HADOOP_ROOT_LOGGER and HADOOP_CLIENT_OPTS is to overwrite them instead of appending.  That
said, I can definitely change it to do it the way you suggest, except I can't just use addAll()
- you ultimately need to use Apps.addToEnvironment on each k/v pair. I could expose an Apps.setEnvFromInputStringMapNoExpand()
(or add a noExpand boolean to the existing one) to handle this though.  

Thanks for the documentation/comment recommendations - I was going to ask about that - I'll
clean those up.
{quote}Nit: setEnvFromInputStringMap does not need to be public.
Will fix. In an earlier iteration I was calling this directly.
{quote}Would it be easier to call tmpEnv.addAll(inputMap) and pass tmpEnv instead of inputMap?
Then we don't need to explicitly iterate the map.
Yes.  I will make this change.

> Add ability to specify user environment variables individually
> --------------------------------------------------------------
>                 Key: MAPREDUCE-7069
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Jim Brennan
>            Assignee: Jim Brennan
>            Priority: Major
>         Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch
> As reported in YARN-6830, it is currently not possible to specify an environment variable
that contains commas via {{mapreduce.map.env}}, mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to specify environment
variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: mapreduce-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-help@hadoop.apache.org

View raw message