airflow-dev mailing list archives

From Chris Riccomini <criccom...@apache.org>
Subject Re: Airflow 2.0
Date Fri, 18 Nov 2016 21:32:36 GMT
A full-fledged REST API (one that the UI also uses) would be great in 2.0.

On Fri, Nov 18, 2016 at 6:26 AM, David Kegley <kegs@b23.io> wrote:
> Hi All,
>
> We have been using Airflow heavily for the last couple months and it’s been great so
> far. Here are a few things we’d like to see prioritized in 2.0.
>
> 1) Role-based access to DAGs:
> We would like to see better role-based access through the UI. There’s a related ticket
> out there but it hasn’t seen any action in a few months:
> https://issues.apache.org/jira/browse/AIRFLOW-85
>
> We use a templating system to create/deploy DAGs dynamically based on some directory/file
> structure. This allows analysts to quickly deploy and schedule their ETL code without having
> to interact with the Airflow installation directly. It would be great if those same analysts
> could access their own DAGs in the UI so that they can clear DAG runs, mark success, etc.
> while keeping them away from our core ETL and other people's/organizations' DAGs. Some of
> this can be accomplished with ‘filter by owner’ but it doesn’t address the use case
> where a DAG can be maintained by multiple users in the same organization when they have separate
> Airflow user accounts.
>
> 2) An option to turn off backfill:
> https://issues.apache.org/jira/browse/AIRFLOW-558
> This would help in cases where a DAG does an insert overwrite on a table every day. This
> might be a realistic option for the current version but I just wanted to call attention to
> this feature request.
>
> Best,
> David
>
> On Nov 17, 2016, at 6:19 PM, Maxime Beauchemin <maximebeauchemin@gmail.com> wrote:
>
> *This is a brainstorm email thread about Airflow 2.0!*
>
> I wanted to share some ideas around what I would like to do in Airflow 2.0
> and would love to hear what others are thinking. I'll compile the ideas
> that are shared in this thread in a Wiki once the conversation fades.
>
> -------------------------------------------
>
> First idea, to get the conversation started:
>
> *Breaking down the package*
> `pip install airflow-common airflow-scheduler airflow-webserver
> airflow-operators-googlecloud ...`
>
> It seems to me like we're getting to a point where having different
> repositories and different packages would make things much easier in all
> sorts of ways. For instance the web server is a lot less sensitive than the
> scheduler, and changes to operators should/could be deployed at will,
> independently from the main package. People in their environment could
> upgrade only certain packages when needed. Travis builds would be more
> targeted, and take less time, ...
>
> Also, the whole current "extras_require" approach to optional dependencies
> (in setup.py) is kind of getting out of hand.
>
> Of course `pip install airflow` would bring in a collection of sub-packages
> similar in functionality to what it does now, perhaps without so many
> operators you probably don't need in your environment.
>
> The release process is the main pain-point and the biggest risk for the
> project, and I feel like this is a solid solution to address it.
>
> Max
>
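
For illustration only, the package split proposed in the quoted message could be sketched roughly as below. The four package names are taken from the `pip install` line in the email; everything else (the `googlecloud` extra name, the `CORE_PACKAGES` and `EXTRAS_REQUIRE` dicts, and the `resolve_requirements` helper) is hypothetical and not an actual Airflow layout. The sketch only models which sub-packages a meta-package might pull in; it does not call setuptools.

```python
# Hypothetical sketch: package names come from the email above; the split
# itself, the extra name, and this helper are assumptions for illustration.

# Sub-packages a plain `pip install airflow` might pull in by default.
CORE_PACKAGES = [
    "airflow-common",
    "airflow-scheduler",
    "airflow-webserver",
]

# Optional operator packages, keyed by a made-up extra name. In a real
# setup.py a dict of this shape is passed to setup() as `extras_require`.
EXTRAS_REQUIRE = {
    "googlecloud": ["airflow-operators-googlecloud"],
}


def resolve_requirements(extras=()):
    """Return the packages implied by `airflow[extra1,extra2,...]`."""
    requirements = list(CORE_PACKAGES)
    for extra in extras:
        if extra not in EXTRAS_REQUIRE:
            raise KeyError("unknown extra: %s" % extra)
        for package in EXTRAS_REQUIRE[extra]:
            # Skip packages already listed so repeated extras are harmless.
            if package not in requirements:
                requirements.append(package)
    return requirements


if __name__ == "__main__":
    print(resolve_requirements())
    print(resolve_requirements(["googlecloud"]))
```

Under this assumed layout, operators would live in their own distributions, so an environment could upgrade `airflow-operators-googlecloud` without touching the scheduler package.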
