flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-6222) YARN: setting environment variables in an easier fashion
Date Mon, 23 Jul 2018 19:29:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553296#comment-16553296 ]

ASF GitHub Bot commented on FLINK-6222:

Github user zentol commented on the issue:

    If a feature isn't visibly documented, chances are no one will use it ;)
    I'm not sure the configuration page is the right place to put it, as it so far deals
exclusively with settings set in `flink-conf.yaml`. Most notably, this line in the introduction
sticks out:
    All configuration is done in conf/flink-conf.yaml, which is expected to be a flat collection
of YAML key value pairs with format key: value.
    You could name it `flink-client-env-sh`; that would make it more obvious that
it only applies to the client.
    However, I have to ask: why a separate file in the first place? We already have config
options for setting environment variables (`env.java.opts`); couldn't we introduce a separate
option for clients?

> YARN: setting environment variables in an easier fashion
> --------------------------------------------------------
>                 Key: FLINK-6222
>                 URL: https://issues.apache.org/jira/browse/FLINK-6222
>             Project: Flink
>          Issue Type: Improvement
>          Components: Startup Shell Scripts
>    Affects Versions: 1.2.0
>         Environment: YARN, EMR
>            Reporter: Craig Foster
>            Assignee: Dawid Wysakowicz
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: patch0-add-yarn-hadoop-conf.diff
> Right now we require end-users to set YARN_CONF_DIR or HADOOP_CONF_DIR and sometimes
> For example, in [1], it is stated: 
> “Please note that the Client requires the YARN_CONF_DIR or HADOOP_CONF_DIR environment
variable to be set to read the YARN and HDFS configuration.” 
> In BigTop, we set this with /etc/flink/default and then a wrapper is created to source
that. However, this is slightly cumbersome and we don't have a central place within the Flink
project itself to source environment variables. config.sh could do this but it doesn't have
information about FLINK_CONF_DIR. For YARN and Hadoop variables, I already have a solution
that would add "env.yarn.confdir" and "env.hadoop.confdir" variables to the flink-conf.yaml
file and then we just symlink /etc/lib/flink/conf/ and /etc/flink/conf. 
> But we could also add a flink-env.sh file to set these variables and decouple them from
config.sh entirely. 
> I'd like to know the opinion/preference of others and what would be more amenable. 
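
To make the first option in the description concrete, here is a minimal sketch of how a startup script might pick up the proposed "env.yarn.confdir" and "env.hadoop.confdir" keys from a flat `key: value` file. It is an illustration under assumptions, not the attached patch: the function names `read_conf_key` and `load_hadoop_env` are hypothetical, the two keys are only the ones proposed in this issue, and letting an already exported variable take precedence over the file is an assumed policy.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical and not part of Flink's config.sh.

# Print the value of key $1 from the flat "key: value" file $2,
# or the default $3 when the key is absent or has an empty value.
read_conf_key() {
    local key_re value
    # Escape dots so that "env.yarn.confdir" is matched literally.
    key_re=$(printf '%s' "$1" | sed 's/\./\\./g')
    # Commented-out lines do not match; the last matching line wins.
    value=$(sed -n "s/^[[:space:]]*${key_re}[[:space:]]*:[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*\$/\1/p" "$2" 2>/dev/null | tail -n 1)
    printf '%s\n' "${value:-$3}"
}

# Export YARN_CONF_DIR / HADOOP_CONF_DIR from config file $1,
# unless they are already set in the environment (assumed precedence).
load_hadoop_env() {
    local dir
    if [ -z "${YARN_CONF_DIR:-}" ]; then
        dir=$(read_conf_key "env.yarn.confdir" "$1" "")
        if [ -n "$dir" ]; then export YARN_CONF_DIR="$dir"; fi
    fi
    if [ -z "${HADOOP_CONF_DIR:-}" ]; then
        dir=$(read_conf_key "env.hadoop.confdir" "$1" "")
        if [ -n "$dir" ]; then export HADOOP_CONF_DIR="$dir"; fi
    fi
}

# Does nothing if the file or the keys are missing.
load_hadoop_env "${FLINK_CONF_DIR:-conf}/flink-conf.yaml"
```

The lookup depends on FLINK_CONF_DIR already being known, which is the gap the description points out for config.sh. A separate flink-env.sh, the second option, would sidestep the parsing altogether by letting users write plain `export` lines.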

This message was sent by Atlassian JIRA
