airflow-commits mailing list archives

From "Chris Sng (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AIRFLOW-1319) Fix misleading SparkSubmitOperator and SparkSubmitHook docstring
Date Mon, 19 Jun 2017 09:11:00 GMT

     [ https://issues.apache.org/jira/browse/AIRFLOW-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Sng updated AIRFLOW-1319:
-------------------------------
    Description: 
The community-contributed Spark submit hook and operator support {{spark-submit}}'s
{{--files}} command line option. The {{--files}} option places the listed files in the working
directory of each executor. Serialized objects are a good example of such files.
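
To illustrate how a {{files}} argument could map onto {{--files}}, here is a minimal sketch. The helper {{build_spark_submit_command}} and the file names are hypothetical and chosen for illustration only; this is not the hook's actual implementation.

```python
from typing import List, Optional


def build_spark_submit_command(application: str, files: Optional[str] = None) -> List[str]:
    """Hypothetical helper: assemble a spark-submit argument list.

    `files` is a comma-separated list of paths and is passed through
    unchanged as the value of spark-submit's --files option.
    """
    command = ["spark-submit"]
    if files:
        # Only emit --files when the caller actually supplied something.
        command += ["--files", files]
    # The application (jar or .py file) comes after the options.
    command.append(application)
    return command


if __name__ == "__main__":
    # Serialized objects are the kind of file the description has in mind.
    print(build_spark_submit_command("job.py", files="model.pkl,lookup.pkl"))
```

Under this sketch, a serialized model is passed through {{files}}, whereas {{hive-site.xml}} would not be a representative value.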

However, in both docstrings (https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/operators/spark_submit_operator.py#L37
and https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/spark_submit_hook.py#L36),
{{hive-site.xml}} is given as the example. This may mislead less-informed developers into
assuming that Hive configuration files can be submitted to the cluster in this manner. According
to Apache Hive's documentation, Hive reads its configuration files from the directory named
by the {{HIVE_CONF_DIR}} environment variable.

I propose excluding this example from the docstrings.

  was:
The community-contributed Spark submit hook and operator support `spark-submit`'s `--files`
command line option. The `--files` option places the listed files in the working directory of
each executor. Serialized objects are a good example of such files.

However, in both docstrings (https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/operators/spark_submit_operator.py#L37
and https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/spark_submit_hook.py#L36),
`hive-site.xml` is given as the example. This may mislead less-informed developers into
assuming that Hive configuration files can be submitted to the cluster in this manner. According
to Apache Hive's documentation, Hive reads its configuration files from the directory named
by the `HIVE_CONF_DIR` environment variable.

I propose excluding this example from the docstrings.


> Fix misleading SparkSubmitOperator and SparkSubmitHook docstring
> ----------------------------------------------------------------
>
>                 Key: AIRFLOW-1319
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1319
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: contrib
>            Reporter: Chris Sng
>            Assignee: Chris Sng
>            Priority: Trivial
>              Labels: documentation
>   Original Estimate: 5m
>  Remaining Estimate: 5m



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
