airflow-dev mailing list archives

From Pedro Machado <pe...@205datalab.com>
Subject Dealing with data latency
Date Mon, 04 Jun 2018 12:46:31 GMT
Hi. What is the recommended way to deal with data latency? I have a daily
feed that is not considered final until 72 hours have passed after the end
of the daily period.

For example, Monday's data would be ready by Thursday at 23:59.

Should I pull data based on the execution date minus a 72-hour offset, or
use the execution date and somehow delay the data pull for 72 hours?

The latter would be more intuitive (data pull date = execution date) but I
am not sure if it's a good pattern.
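
To make the two options concrete, here is a rough sketch of the date
arithmetic in plain Python. It is not Airflow code. It assumes a daily
schedule where the run for a given execution date starts once that day
has ended, and the names (LATENCY, PERIOD, data_day_with_offset,
earliest_pull_time) are made up for illustration.

```python
from datetime import datetime, timedelta

LATENCY = timedelta(hours=72)  # feed is final 72 hours after the daily period ends
PERIOD = timedelta(days=1)     # assumed daily schedule


def data_day_with_offset(execution_date):
    """Option 1: run on the normal schedule, but pull an older day.

    The run for execution_date starts when that day ends, so the newest
    day that is already final is execution_date minus the latency.
    """
    return execution_date - LATENCY


def earliest_pull_time(execution_date):
    """Option 2: pull the execution date's own day, but only once it is final.

    The day ends at execution_date + PERIOD, and the feed is final
    LATENCY after that.
    """
    return execution_date + PERIOD + LATENCY


monday = datetime(2018, 5, 28)
thursday = datetime(2018, 5, 31)

# Option 1: Thursday's run (which starts at Friday 00:00) pulls Monday's data.
print(data_day_with_offset(thursday))

# Option 2: Monday's run waits until Friday 00:00, i.e. the end of Thursday.
print(earliest_pull_time(monday))
```

Either way Monday's data is pulled at the same wall-clock time. The
difference is only whether the pull is labeled with Thursday's execution
date (option 1) or Monday's (option 2).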

Thanks,

Pedro
