airflow-dev mailing list archives

From Robin Edwards <...@bidnamic.com>
Subject Handling look back periods in data sources
Date Fri, 26 Apr 2019 17:17:29 GMT
I was wondering if anyone has a pattern for handling look back periods
when ingesting data. One of our sources attributes cost up to 90 days
in arrears. So currently I am ingesting @daily but re-writing the last
90 days with depends_on_past=True on my operator. This feels wrong as
I am affecting data outside of my schedule interval.

It also makes backfilling slow as I have to ingest more data than is
required. I could make my operator smarter and only re-write for the
latest run. But maybe there is a cleaner solution to this?
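
To make the "smarter operator" idea concrete, here is a minimal sketch of the date-window logic, written as plain Python with no Airflow imports. The names ingest_window and latest_execution_date are hypothetical, and how a task works out which run is the latest is left as an assumption. Historical (backfill) runs write only their own day, and only the latest run re-writes the 90-day look back.

```python
from datetime import date, timedelta

# From the message: cost can be attributed up to 90 days in arrears.
LOOKBACK_DAYS = 90


def ingest_window(execution_date, latest_execution_date, lookback_days=LOOKBACK_DAYS):
    """Return the inclusive (start, end) dates this run should (re)write.

    Hypothetical helper: only the latest run pays for the full look back;
    every earlier run is limited to its own schedule interval.
    """
    if execution_date >= latest_execution_date:
        # Latest run: its own day plus the preceding `lookback_days` days.
        return execution_date - timedelta(days=lookback_days), execution_date
    # Backfill run: just the day it is scheduled for.
    return execution_date, execution_date


if __name__ == "__main__":
    latest = date(2019, 4, 26)
    print(ingest_window(date(2019, 4, 26), latest))  # latest run, wide window
    print(ingest_window(date(2019, 3, 1), latest))   # backfill run, single day
```

With this split a backfill touches each day once, and the wide re-write happens only in the most recent run, though that run still modifies data outside its own schedule interval.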

Thanks for your help,

Rob
