hadoop-yarn-issues mailing list archives

From "Eric Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7677) HADOOP_CONF_DIR should not be automatically put in task environment
Date Wed, 03 Jan 2018 17:57:03 GMT

    [ https://issues.apache.org/jira/browse/YARN-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309995#comment-16309995 ]

Eric Yang commented on YARN-7677:

[~ebadger] I think I understand your scenario better now.  When the host and the Docker
containers run two separate Hadoop clusters, we do not want the host Hadoop configuration
exposed to Docker, because the host's disk settings and file system layout do not apply there.
In other scenarios, such as a Spark Python application that runs in a Docker container and
accesses host-level HDFS outside the container, the Hadoop configuration should be inherited
from the host so that the well-optimized timing settings are exposed.  For huge clusters, the
first scenario may be used to isolate virtual clusters.

For smaller clusters, it is more likely that mixed workloads run together and Docker is used
only to isolate programming libraries.  The host-level node manager whitelist cannot be
overwritten by the container.  I think both cases can be supported, and the default should
probably be to inherit {{HADOOP_CONF_DIR}}, which suits smaller clusters and makes efficient
use of system resources.  It would be better if we built an env_reset switch, as a job
submission flag, to disable inheritance of Hadoop system environment variables.
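
To make the proposal concrete, here is a minimal sketch of how such a switch could behave.
It is not YARN code: the function name, the {{inherited}} parameter, and the example paths are
all hypothetical, and only the env_reset idea and {{HADOOP_CONF_DIR}} come from this discussion.

```python
from typing import Dict, Iterable, Mapping


def resolve_task_env(
    host_env: Mapping[str, str],
    image_env: Mapping[str, str],
    env_reset: bool = False,
    inherited: Iterable[str] = ("HADOOP_CONF_DIR",),
) -> Dict[str, str]:
    """Hypothetical model of the proposed env_reset switch (not a YARN API).

    host_env  : environment of the host node manager
    image_env : environment preset inside the Docker image
    env_reset : when True, Hadoop system variables are NOT inherited from the host
    inherited : names treated as Hadoop system variables (an assumption)
    """
    # Start from whatever the image already defines.
    task_env = dict(image_env)
    if not env_reset:
        # Default case: the host value wins, so the task sees the host configuration.
        for name in inherited:
            if name in host_env:
                task_env[name] = host_env[name]
    return task_env


# Illustrative values only; the paths are made up for this example.
host = {"HADOOP_CONF_DIR": "/etc/hadoop/conf", "PATH": "/usr/bin"}
image = {"HADOOP_CONF_DIR": "/opt/hadoop/etc/hadoop"}

shared_hdfs_env = resolve_task_env(host, image)                  # host config inherited
isolated_env = resolve_task_env(host, image, env_reset=True)     # image config kept
```

With the default, the task gets the host's {{HADOOP_CONF_DIR}} (the shared-HDFS scenario); with
env_reset set, the value preset in the image is left untouched (the separate-cluster scenario).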

> HADOOP_CONF_DIR should not be automatically put in task environment
> -------------------------------------------------------------------
>                 Key: YARN-7677
>                 URL: https://issues.apache.org/jira/browse/YARN-7677
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Eric Badger
>            Assignee: Eric Badger
> Currently, {{HADOOP_CONF_DIR}} is being put into the task environment whether it's set
> by the user or not. It completely bypasses the whitelist and so there is no way for a task
> to not have {{HADOOP_CONF_DIR}} set. This causes problems in the Docker use case where Docker
> containers will set up their own environment and have their own {{HADOOP_CONF_DIR}} preset
> in the image itself.

This message was sent by Atlassian JIRA
