airflow-commits mailing list archives

From "Raphael Lopez Kaufman (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AIRFLOW-1398) Add ability for ExternalTaskSensor to wait on multiple runs of a task
Date Tue, 11 Jul 2017 08:14:00 GMT
Raphael Lopez Kaufman created AIRFLOW-1398:
----------------------------------------------

             Summary: Add ability for ExternalTaskSensor to wait on multiple runs of a task
                 Key: AIRFLOW-1398
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1398
             Project: Apache Airflow
          Issue Type: Improvement
            Reporter: Raphael Lopez Kaufman


Currently, the execution_date_fn parameter of the ExternalTaskSensor only allows waiting for
the completion of a single run of the task that the ExternalTaskSensor is sensing.
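
To illustrate the limitation, here is a minimal sketch of the kind of callable one might pass as execution_date_fn today. The function name is hypothetical, and the sketch assumes the callable receives the sensing task's execution date and returns exactly one execution date to look up; it uses only the standard library and does not touch Airflow itself.

```python
from datetime import datetime, timedelta


def last_hourly_run_of_day(execution_date):
    """Hypothetical execution_date_fn: map a daily execution date to ONE
    hourly run (here, the 23:00 run of the same day).

    Only a single datetime can be returned, so the other 23 hourly runs
    of that day are never checked.
    """
    return execution_date + timedelta(hours=23)


# Example: the daily run for 2017-07-11 would only sense the 23:00 hourly run.
sensed = last_hourly_run_of_day(datetime(2017, 7, 11))
```

Whichever single run is chosen, the daily dag has no guarantee that the remaining hourly runs succeeded.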

However, this prevents setups in which dags have different schedule frequencies but still
depend on one another. For example, say you have a dag scheduled hourly that transforms log
data and is owned by the team in charge of logging. In the current setup, higher-level teams
that want to use this transformed data cannot create dags that process it in daily batches
while making sure the transformed log data was properly created. Note that simply waiting for
the data to be present (e.g. using the HivePartitionSensor if the data is in Hive) might not
be satisfactory, because the data being present doesn't mean it is ready to be used.

Adding the ability for an ExternalTaskSensor to wait until multiple runs of the task it is
sensing have finished would allow higher-level teams to set up dags with an ExternalTaskSensor
that senses the end task of the dag transforming the log data and waits for the successful
completion of 24 of its hourly runs.
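
As a rough sketch of what the daily dag would need to wait on, the helper below lists the 24 hourly execution dates that fall within one daily execution date. The function name and its `runs` parameter are hypothetical, and how the sensor would actually accept several dates is not specified in this issue; the code only shows the date arithmetic, assuming the hourly runs start at the daily execution date.

```python
from datetime import datetime, timedelta


def hourly_runs_for_day(execution_date, runs=24):
    """Hypothetical helper: return the execution dates of the hourly runs
    covered by one daily run, starting at the daily execution date."""
    return [execution_date + timedelta(hours=h) for h in range(runs)]


# Example: the daily run for 2017-07-11 would wait on the hourly runs
# from 00:00 through 23:00 of that day.
dates = hourly_runs_for_day(datetime(2017, 7, 11))
```

A sensor with the proposed ability would succeed only once the sensed task has completed successfully for every date in such a list.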



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
