airflow-dev mailing list archives

From Chen Tong <cix...@gmail.com>
Subject Re: schedule_interval question
Date Thu, 18 Apr 2019 14:27:28 GMT
Do not set start_date to datetime.now(). You could set it to 2019-04-18, and
it will start scheduling at 2019-04-18 2 AM.

Chen
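
A minimal sketch of why a fixed start_date works and datetime.now() does not. It uses only the standard library and is a simplified model, not Airflow's scheduler code. The helper names first_daily_run and can_run are hypothetical, and the rule that a run is triggered one interval after its execution date is an assumption about Airflow's behaviour at the time of this thread.

```python
from datetime import datetime, timedelta


def first_daily_run(start_date, hour=2):
    """Model a daily schedule at `hour`:00 (like cron "0 2 * * *").

    Returns (execution_date, trigger_time) for the first run at or
    after start_date. Assumption: the run for an execution date is
    triggered once its one-day interval has elapsed.
    """
    candidate = start_date.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate < start_date:
        # Today's slot is already behind start_date, so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate, candidate + timedelta(days=1)


def can_run(execution_date, start_date):
    """Dependency check from the thread: a task instance is blocked
    when its execution date is before the start date."""
    return execution_date >= start_date


# Fixed start_date: the first execution date is stable across parses.
execution_date, trigger_time = first_daily_run(datetime(2019, 4, 18))
print(execution_date, trigger_time, can_run(execution_date, datetime(2019, 4, 18)))

# start_date=datetime.now(): the DAG file is re-parsed, so start_date keeps
# moving forward and ends up later than the execution date of the run.
# These are the two timestamps from the error message quoted in the thread.
print(can_run(datetime(2019, 4, 18, 11, 9, 16, 193396),
              datetime(2019, 4, 18, 11, 10, 42, 607861)))
```

With a static date the check passes on every parse. With datetime.now() the start date is always a little later than any run created earlier, which matches the "execution date ... is before the task's start date" message quoted below.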

On Thu, Apr 18, 2019, 08:55 Pawel Bartoszek <pawel.bartoszek.bbc@gmail.com>
wrote:

> Ash, if I omit start_date I get the error
> Task is missing the start_date parameter
>
> What should I set it to then?
>
> On Thu, Apr 18, 2019 at 1:03 PM Ash Berlin-Taylor <ash@apache.org> wrote:
>
> > Do not set start_date to now. That will _always_ be wrong.
> > https://airflow.apache.org/faq.html#what-s-the-deal-with-start-date
> >
> > > On 18 Apr 2019, at 12:13, Pawel Bartoszek <pawel.bartoszek.bbc@gmail.com>
> > > wrote:
> > >
> > > Hi,
> > >
> > > When I set start_date to datetime.now() ie
> > >
> > > DAG(
> > >        dag_id="dag",
> > >        start_date=datetime.now(),
> > >        schedule_interval="0 2 * * *",
> > >        default_view="graph",
> > >        orientation="TB",
> > >        concurrency=1,
> > >        max_active_runs=1,
> > >        catchup=False
> > > )
> > >
> > > I get the following info in the task instance details
> > >
> > > DependencyReason
> > > Execution Date The execution date is 2019-04-18T11:09:16.193396+00:00 but
> > > this is before the task's start date 2019-04-18T11:10:42.607861+00:00.
> > > Execution Date The execution date is 2019-04-18T11:09:16.193396+00:00 but
> > > this is before the task's DAG's start date 2019-04-18T11:10:42.607861+00:00.
> > > Dagrun Running Task instance's dagrun did not exist: Unknown reason.
> > >
> > > I thought the execution date should be set to 2019-04-19 02:00?
> > >
> > >
> > > On Wed, Apr 17, 2019 at 8:37 PM Chao-Han Tsai <milton0825@gmail.com> wrote:
> > >
> > >> Hi Pawel,
> > >>
> > >> I think you can change the start_date to a later date to avoid the DagRun
> > >> of 2019-04-16 02:00 being scheduled.
> > >>
> > >> Chao-Han
> > >>
> > >> On Wed, Apr 17, 2019 at 10:13 AM Pawel Bartoszek <
> > >> pawel.bartoszek.bbc@gmail.com> wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> Let's say I deploy the following DAG at 2019-04-17 5 PM
> > >>>
> > >>> DAG(
> > >>>        dag_id="dag",
> > >>>        start_date=datetime(year=2018, month=1, day=1, hour=2, minute=0),
> > >>>        schedule_interval="0 2 * * *",
> > >>>        default_view="graph",
> > >>>        orientation="TB",
> > >>>        concurrency=1,
> > >>>        max_active_runs=1,
> > >>>        catchup=False)
> > >>>
> > >>>
> > >>> I noticed that the DAG will first be scheduled for yesterday, i.e.
> > >>> 2019-04-16 2 AM. How can I avoid this? I want the DAG to be scheduled in
> > >>> the future according to the cron expression, i.e. 2019-04-18 2 AM.
> > >>>
> > >>> Setting schedule_interval as
> > >>>
> > >>> schedule_interval=timedelta(hours=24),
> > >>>
> > >>> correct me if I am wrong, but Airflow seems to schedule the DAG 24 hours
> > >>> in the past from the time the DAG was deployed.
> > >>>
> > >>> Thanks,
> > >>> Pawel
> > >>>
> > >>
> > >>
> > >> --
> > >>
> > >> Chao-Han Tsai
> > >>
> >
> >
>
