hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-7069) Add ability to specify user environment variables individually
Date Wed, 04 Apr 2018 16:04:00 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425770#comment-16425770
] 

Jason Lowe commented on MAPREDUCE-7069:
---------------------------------------

Thanks for the patch!

It's a bit odd and inefficient that setVMEnv calls MRApps.setEnvFromInputProperty twice. 
I think it would be clearer and more efficient to call it once, place the results in a temporary
map (like it already does in the second call), then only set HADOOP_ROOT_LOGGER and HADOOP_CLIENT_OPTS
in the environment if they are not set in the temporary map.  Then at the end we can simply
call addAll to dump the contents of the temporary map into the environment map.

The example documentation in JobConf is confusing.  It uses "MAPRED_MAP_TASK_ENV" and "MAPRED_REDUCE_TASK_ENV"
but those literal strings should not be used in the property name.  It would be clearer if
this used "mapreduce.map.env" and "mapreduce.reduce.env" in the examples.  Either that or
give the example in the Java realm with something like set(MAPRED_MAP_TASK_ENV + ".varName",
varValue) so it's clearly not a literal string in the property name.  My pereference is the
former.

The relevant property descriptions in mapred-default.xml should be updated to reflect the
new functionality.

It would be good to update MapReduceTutorial.md to document the options for passing environment
variables to tasks.

There are a number of comments in setEnvFromString that should be fixed up.  I realize this
is mostly cut-n-paste from the old setEnvFromInputString, but since we're refactoring it would
be nice to clean it up a bit in the process.  There's not such thing as a tt (tasktracker)
in YARN, and the comments imply this is only called to setup the env by a nodemanager for
a child process.  That's not always the case.  "note" s/b "not", etc.

For javadoc comments it's not necessary to state the type of the variable after the variable
name.  Javadoc can automatically extract this from the method signature.

Nit: setEnvFromInputStringMap does not need to be public.

Would it be easier to call tmpEnv.addAll(inputMap) and pass tmpEnv instead of inputMap?  Then
we don't need to explicitly iterate the map.

The unit test should add a new properies with commas and or equal signs in the value and verify
the values come through in the environment map.

Does it make sense to split some of the unit test up into separate tests?  For example the
null input test can easily stand by itself.  Separate tests make it easier to identify what's
working and what's broken rather than a stacktrace with a line number in the middle of a large
unit test that is testing many different aspects.


> Add ability to specify user environment variables individually
> --------------------------------------------------------------
>
>                 Key: MAPREDUCE-7069
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Jim Brennan
>            Assignee: Jim Brennan
>            Priority: Major
>         Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an environment variable
that contains commas via {{mapreduce.map.env}}, mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to specify environment
variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-help@hadoop.apache.org


Mime
View raw message