airflow-dev mailing list archives

From Rafael Barbosa <rrbarb...@gmail.com>
Subject Re: DAG dependencies on different frequencies
Date Tue, 14 Jun 2016 07:47:40 GMT
Hi Chris,

It's not clear to me how to use the ExternalTaskSensor. In the case I
described, as far as I understand, it can be used for the "consumer" to
wait for every instance of the "producer" to finish. However, I would like
the "consumer" to wait for 7 executions of the "producer".

Am I missing something?

Best,
Rafael

Rafael Barbosa
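A minimal sketch of the date arithmetic behind the question above, in plain Python with no Airflow import. It assumes the weekly "consumer" run should wait for the daily "producer" run that shares its execution date plus the six runs before it; that alignment is an assumption, not something stated in the thread. The function names are hypothetical. One possible use is to create one ExternalTaskSensor per offset and pass each offset as its execution_delta, but the DAG and task ids in the comment are made up and the wiring should be checked against the Airflow version in use.

```python
from datetime import datetime, timedelta


def producer_deltas(runs=7, step=timedelta(days=1)):
    """Offsets from the consumer's execution date back to each producer run."""
    return [step * k for k in range(runs)]


def producer_execution_dates(consumer_execution_date, runs=7):
    """Producer execution dates the consumer would wait for, oldest first."""
    return sorted(consumer_execution_date - d for d in producer_deltas(runs))


# Hypothetical wiring (not executed here), one sensor per offset:
#   ExternalTaskSensor(task_id="wait_producer_%d" % k,
#                      external_dag_id="producer",     # assumed DAG id
#                      external_task_id="load_data",   # assumed task id
#                      execution_delta=delta,
#                      dag=consumer_dag)

if __name__ == "__main__":
    monday = datetime(2016, 6, 13)
    for d in producer_execution_dates(monday):
        print(d.date())
```

The consumer's own tasks would then be set downstream of all seven sensors, so the report only starts once every producer run in the window has finished.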

On Tue, Jun 14, 2016 at 5:52 AM, Chris Riccomini <criccomini@apache.org>
wrote:

> An alternative approach would be to use the ExternalTaskSensor to sense
> when the task(s)/data that you depend upon has finished/arrived.
>
>
> https://pythonhosted.org/airflow/code.html#airflow.operators.ExternalTaskSensor
>
> On Mon, Jun 13, 2016 at 9:08 AM, Laura Lorenz <llorenz@industrydive.com>
> wrote:
>
> > We do something like that here in this way, and I'd love feedback from
> > anyone else if they do it a different way. Basically we use
> > TriggerDagRunOperators and a task in the "Producer" that cares what day of
> > the week it is being run, and a "Consumer" DAG with no schedule interval
> > that depends on being triggered when the right conditions of the "Producer"
> > DAG are met.
> >
> > In more detail, we have a daily DAG like your "Producer" that ingests data
> > every day. We want to run a series of tasks, including generating a report,
> > in a "Consumer" DAG after Monday's data has been processed, which
> > would be on a Tuesday, after a run of "Producer" with an execution date of
> > Monday.
> >
> > The last task in "Producer" is a TriggerDagRunOperator that only triggers
> > IF it notices that now() is Tuesday. This triggers an instance at the right
> > time of a child DAG of the "Consumer" type.
> >
> >
> > On Mon, Jun 13, 2016 at 11:50 AM, Rafael Barbosa <rrbarbosa@gmail.com>
> > wrote:
> >
> > > Hi everyone,
> > >
> > > I tried looking around, but I cannot find an archive of this mailing list,
> > > so I apologize if this question has been asked before.
> > >
> > > I am looking for a way to set the dependency between two DAGs: a "Producer"
> > > that runs every day, appending data to a DB, and a "Consumer" that runs
> > > weekly, generating a summary report of the new data.
> > >
> > > I would like to run the "Consumer" every Monday (for instance), but only
> > > after the Monday instance of the daily "Producer" is complete. What would
> > > be the best way to accomplish this?
> > >
> > > Thanks,
> > > Rafael
> > >
> >
>

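The trigger condition Laura describes in the thread (the last "Producer" task fires only when now() is a Tuesday) can be sketched in plain Python. This is an illustration, not her code: the function names are hypothetical, and the (context, dag_run_obj) callable shape is how TriggerDagRunOperator's python_callable worked in Airflow 1.x releases as far as I know, so treat it as an assumption to verify against your version.

```python
from datetime import datetime

TUESDAY = 1  # datetime.weekday() counts Monday as 0


def is_trigger_day(now, weekday=TUESDAY):
    """True when `now` falls on the weekday that should start the consumer."""
    return now.weekday() == weekday


def maybe_trigger(context, dag_run_obj, now=None):
    """Return dag_run_obj to request a trigger, or None to skip it."""
    now = now or datetime.now()
    return dag_run_obj if is_trigger_day(now) else None


# Hypothetical wiring (not executed here):
#   TriggerDagRunOperator(task_id="trigger_consumer",
#                         trigger_dag_id="consumer",   # assumed DAG id
#                         python_callable=maybe_trigger,
#                         dag=producer_dag)
```

Checking now() ties the decision to wall-clock time, so a backfill of an old Monday run would not trigger the consumer unless it happens to execute on a Tuesday; comparing against the run's execution date instead is one alternative.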