airflow-dev mailing list archives

From Kevin Lam <>
Subject Making Airflow Fault-Tolerant when running Airflow on Kubernetes
Date Wed, 12 Sep 2018 20:08:33 GMT
Hi all,

We currently run Airflow as a Deployment in a Kubernetes cluster. We also
use a variant of KubernetesOperator to run our DAGs.

We are investigating how best to make Airflow fault-tolerant, in part because
we are evaluating preemptible VMs [1]. *Has there been much discussion about
how to deploy Airflow in a fault-tolerant way? Are there any best practices?
Ideally, we'd like our Kubernetes-hosted Airflow to support rolling updates
when the Docker image changes and to recover from components (worker,
scheduler, web) going down temporarily, including while DAGs are in flight.*

Any advice, ideas and/or feedback appreciated!

