airflow-dev mailing list archives

From George Leslie-Waksman <waks...@gmail.com>
Subject Re: Catchup By default = False vs LatestOnlyOperator
Date Mon, 23 Jul 2018 21:03:38 GMT
Ok, not so fringe; I'm glad it's working well for your use case, James.

I retract my suggestion of deprecation.

On Mon, Jul 23, 2018 at 12:58 PM James Meickle
<jmeickle@quantopian.com.invalid> wrote:

> We use LatestOnlyOperator in production. Generally our data is available on
> a regular schedule, and we update production services with it as soon as it
> is available; we might occasionally want to re-run historical days, in
> which case we want to run the same DAG but without interacting with live
> production services at all.
>
> On Mon, Jul 23, 2018 at 2:18 PM, George Leslie-Waksman <waksman@gmail.com>
> wrote:
>
> > As the author of LatestOnlyOperator, I intended it as a stopgap until
> > catchup=False landed.
> >
> > There are some (very) fringe use cases where you might still want
> > LatestOnlyOperator but in almost all cases what you want is probably
> > catchup=False.
> >
> > The situations where LatestOnlyOperator is still useful are where you want
> > to run most of your DAG for every schedule interval but you want some of
> > the tasks to run only on the latest run (not catching up, not backfilling).
> >
> > It may be best to deprecate LatestOnlyOperator at this point to avoid
> > confusion.
> >
> > --George
> >
> > On Sat, Jul 21, 2018 at 7:34 PM Ben Tallman <btallman@gmail.com> wrote:
> >
> > > As the author of catch-up, the idea is that in many cases your data
> > > doesn't "window" nicely and you want instead to just run as if it were a
> > > brilliant Cron...
> > >
> > > Ben
> > >
> > > > On Jul 20, 2018, at 11:39 PM, Shah Altaf <mendhak@gmail.com> wrote:
> > > >
> > > > Hi, my understanding is: if you use the LatestOnlyOperator, then when you
> > > > run the DAG for the first time you'll see a whole bunch of DAG runs queued
> > > > up, and in each run the LatestOnlyOperator will cause the rest of the DAG
> > > > run to be skipped.  Only the latest DAG run will run in 'full'.
> > > >
> > > > With catchup = False, you should get just the latest DAG run.
> > > >
> > > >
> > > > On Fri, Jul 20, 2018 at 10:58 PM Shubham Gupta <shubham180695.sg@gmail.com>
> > > > wrote:
> > > >
> > > >> ---------- Forwarded message ---------
> > > >> From: Shubham Gupta <shubham180695.sg@gmail.com>
> > > >> Date: Fri, Jul 20, 2018 at 2:38 PM
> > > >> Subject: Catchup By default = False vs LatestOnlyOperator
> > > >> To: <dev-subscribe@airflow.incubator.apache.org>
> > > >>
> > > >>
> > > >> Hi!
> > > >>
> > > > >> Can someone please explain the difference between catchup by default =
> > > > >> False and LatestOnlyOperator?
> > > >>
> > > > >> Regards,
> > > >> Shubham Gupta
> > > >>
> > >
> >
>
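
The difference discussed in this thread can be sketched with a small
pure-Python model. This is not Airflow code and does not import Airflow; it
is a simplified illustration under stated assumptions: a run for execution
date d covers the interval [d, d + interval) and is created only after that
interval has ended, and a run counts as "latest" when the current time falls
in the interval immediately after its own. The task names "load" and
"publish" and the helper names below are hypothetical.

```python
from datetime import datetime, timedelta


def scheduled_runs(start, interval, now, catchup):
    """Execution dates of the runs a scheduler would create (simplified model).

    A run for execution date d is created once its interval [d, d + interval)
    has ended. With catchup disabled, only the most recent such run is kept.
    """
    runs = []
    d = start
    while d + interval < now:
        runs.append(d)
        d += interval
    return runs if catchup else runs[-1:]


def is_latest(execution_date, interval, now):
    """True when `now` falls in the interval right after the run's own interval."""
    left = execution_date + interval
    right = left + interval
    return left < now <= right


def plan(start, interval, now, catchup, latest_only_tasks=()):
    """Map each execution date to the tasks that would actually do work.

    Tasks listed in `latest_only_tasks` stand in for tasks placed downstream
    of a latest-only check: they are skipped on every run except the latest.
    """
    tasks = ["load", "publish"]  # hypothetical task names
    result = {}
    for d in scheduled_runs(start, interval, now, catchup):
        latest = is_latest(d, interval, now)
        result[d] = [t for t in tasks if latest or t not in latest_only_tasks]
    return result


if __name__ == "__main__":
    start = datetime(2018, 7, 20)
    now = datetime(2018, 7, 23, 21, 0)
    day = timedelta(days=1)

    # Catch up on every interval, but keep "publish" for the latest run only.
    for d, ran in plan(start, day, now, True, {"publish"}).items():
        print("catchup=True + latest-only:", d.date(), ran)

    # No catchup: only the most recent interval gets a run at all.
    for d, ran in plan(start, day, now, False).items():
        print("catchup=False:", d.date(), ran)
```

In this model, the first configuration still runs "load" for every past
interval and reserves "publish" for the newest run, which matches the
production use case James describes. The second configuration creates no
historical runs at all.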
