airflow-commits mailing list archives

From "ASF subversion and git services (JIRA)" <>
Subject [jira] [Commented] (AIRFLOW-1770) Add option to file and hiveconfs in HiveOperator
Date Fri, 12 Jan 2018 10:02:00 GMT


ASF subversion and git services commented on AIRFLOW-1770:

Commit eb994d683f244f63dd191a6640baaee66ffc8e29 in incubator-airflow's branch refs/heads/master
from [~alanma]

[AIRFLOW-1770] Allow HiveOperator to take in a file

Clarify and upgrade HiveOperator. Include
description of hql parameter being able to
take in a relative path from the dag file
of a hive script, templated or not. Add
ability to template hiveconf variables. Add
default value to the map reduce job name as
well as add updated hiveconf var for queue.

Closes #2752 from wolfier/AIRFLOW-1770

> Add option to file and hiveconfs in HiveOperator
> ------------------------------------------------
>                 Key: AIRFLOW-1770
>                 URL:
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Ace Haidrey
>            Assignee: Alan Ma
>              Labels: operators
>             Fix For: 1.10.0, 1.9.1, 2.0.0
> The HiveOperator as it currently stands is not flexible enough to accept a Hive file
> and operate on it. You have to read in the contents and pass them in, and if you do that
> you need to change the way hiveconfs are referenced in your file to Jinja templating.
> Many teams already have existing sql/hql files and don't want to convert them, so that
> the files stay as portable and decoupled as possible.
> To accomplish this, all we need to do is add the option to pass an hql_file and hiveconfs
> to the HiveOperator. We change the code in execute to throw an error if both an hql_file
> and an hql statement are passed. If only hql_file is passed, the simplest way, without
> changing the code of the hive hook, is to read the content of the hql_file and set it as
> self.hql. The hiveconfs get passed directly to the run_cli method, and we can combine them
> with the already passed-in hiveconfs.
> If we want to make it optional to pass in the context as hiveconfs, we can add that too,
> as related to AIRFLOW-788.
> I've included some simple tests to show it all works as we expect.
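
The logic proposed in the issue description above can be sketched roughly as follows. This is a standalone illustration, not the code from the commit: the helper names resolve_hql and merge_hiveconfs are hypothetical, and the sketch does not import or call Airflow. It only shows the three steps described: reject hql and hql_file together, read the file contents when only hql_file is given, and combine the two sets of hiveconfs.

```python
def resolve_hql(hql=None, hql_file=None):
    """Return the HQL text to run (hypothetical helper, not Airflow API).

    Mirrors the behavior described in the issue: passing both an inline
    statement and a file is an error; a file is read in full.
    """
    if hql and hql_file:
        raise ValueError("Pass either hql or hql_file, not both")
    if hql_file:
        with open(hql_file) as f:
            return f.read()
    if not hql:
        raise ValueError("One of hql or hql_file is required")
    return hql


def merge_hiveconfs(existing=None, extra=None):
    """Combine two hiveconf dicts without mutating either input.

    Assumption: keys in `extra` win on conflict; the issue does not
    say which side should take precedence.
    """
    merged = dict(existing or {})
    merged.update(extra or {})
    return merged
```

In the operator itself, the resolved text would be assigned to self.hql and the merged dict handed to the hook's run_cli call, as the description says; how the committed change wires this up may differ.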

This message was sent by Atlassian JIRA
