airflow-dev mailing list archives

From Boris Tyukin
Subject Re: ETL best practices for airflow
Date Sun, 16 Oct 2016 22:40:05 GMT
I really look forward to it, Gerard! I've read what you wrote so far
and I really liked it - please keep up the great work!

I am hoping to see some best practices for the design of incremental loads
and for using timestamps from source database systems that are not on UTC
(I am still confused about how this works in Airflow). Also the practical
use of subdags and dynamic generation of tasks from external metadata
(maybe describe in detail something similar to what WePay did).

On Sun, Oct 16, 2016 at 5:23 PM, Gerard Toonstra wrote:

> Hi all,
> About a year ago, I contributed the HTTPOperator/Sensor and I've been
> tracking airflow since. Right now it looks like we're going to adopt
> airflow at the company I'm currently working at.
> In preparation for that, I've done a bit of research into how airflow
> pipelines should fit together and how important ETL principles are covered,
> and decided to write this up on a documentation site. The airflow
> documentation site covers how airflow works and the constructs that
> you have available to build pipelines, but it can still be a challenge for
> newcomers to figure out how to put those constructs together and use them
> effectively.
> The articles I found online don't go into a lot of detail either. Airflow
> is built around an important philosophy towards ETL and there's a risk that
> newcomers simply pick up a really great tool and start off in the wrong way
> when using it.
> This weekend, I set off to write some documentation to try to fill this
> gap. It starts off with a generic understanding of important ETL principles
> and I'm currently working on a practical step-by-step example that adheres
> to these principles with DAG implementations in airflow; i.e. showing how
> it can all fit together.
> You can find the current version here:
> Looking forward to your comments. If you have better ideas on how I can
> make this contribution, don't hesitate to contact me with your suggestions.
> Best regards,
> Gerard
