airflow-users mailing list archives

From Jarek Potiuk <ja...@potiuk.com>
Subject [DISCUSS] Providers/Airflow versioning - consequences (Dragons ahead!)
Date Sun, 13 Jun 2021 15:21:23 GMT
Dear Airflow community,

I would like to follow up on the plan to release new providers (tomorrow,
I hope) that are compatible only with Airflow 2.1+.

I briefly mentioned it in [1] but some later discussions in slack/issues
made me think that I need to spell out some of the non-obvious consequences
that people who manage Airflow should be aware of.

How providers and Airflow versions play together might not be obvious. I
explained some of it in my Medium post [2], but here is the gist of how you
should deal with Airflow/providers installation. After some comments and
possibly discussion, I might turn it into another page in our documentation
if people think it is useful.

1. Using Airflow DockerHub reference images

The reference images [3] provide preinstalled airflow with a predefined set
of extras (including some common providers). Dependencies are installed
with the "Golden" set for that version - the same that is captured in the
versioned constraint files. This means that if new providers are released
later, you need to upgrade the providers in the image yourself if you want
them. You can do it via the extension or customisation mechanisms. For
example, you might build your image with this Dockerfile to upgrade to the
latest version of the google provider:

FROM apache/airflow:2.1.0-python3.8
RUN pip install --upgrade apache-airflow-providers-google

When we release a new version of the image, it brings a new "Golden" set of
providers for that version (with potentially breaking changes for some
providers!). If a breaking change in the new providers does not suit you,
you can still downgrade to previous versions of those providers. For
example, the Dockerfile below pins the google provider to 3.0.0 in case you
do not want the latest version because of its breaking changes:

FROM apache/airflow:2.2.0-python3.8
RUN pip install apache-airflow-providers-google==3.0.0

2. Installing Airflow on your own

As described in our installation manual [4], use constraint files and extras
to install providers. This installs the "Golden" set of dependencies for
any combination of extras that you might want, without conflicts. After
that you can continue updating Airflow, its dependencies and providers
manually, WITHOUT constraint files. Constraint files pin the dependencies,
so if you use them you cannot upgrade those :). If you just want to upgrade
specific providers or dependencies, you can do it the usual way. For
example, `pip install --upgrade apache-airflow-providers-http` upgrades the
http provider to the latest version COMPATIBLE with the Airflow version you
installed (see more about it below).
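
To make the two steps above concrete, here is a minimal Python sketch that
only builds the pip commands (it does not run them). The helper names are
hypothetical and the versions and extras are just examples; the constraints
URL follows the pattern used in the installation manual [4].

```python
from typing import List


def constraints_url(airflow_version: str, python_version: str) -> str:
    """URL of the "Golden" constraint file for one Airflow/Python pair."""
    return (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )


def initial_install_command(
    airflow_version: str, python_version: str, extras: List[str]
) -> List[str]:
    """Step 1: install Airflow and extras WITH the constraint file."""
    extras_part = f"[{','.join(extras)}]" if extras else ""
    return [
        "pip", "install",
        f"apache-airflow{extras_part}=={airflow_version}",
        "--constraint", constraints_url(airflow_version, python_version),
    ]


def provider_upgrade_command(provider_package: str) -> List[str]:
    """Step 2: upgrade a single provider WITHOUT the constraint file."""
    return ["pip", "install", "--upgrade", provider_package]


if __name__ == "__main__":
    # Example values only; substitute your own versions and extras.
    print(" ".join(initial_install_command("2.1.0", "3.8", ["google", "http"])))
    print(" ".join(provider_upgrade_command("apache-airflow-providers-http")))
```

The commands are returned as argument lists so they could be passed to
something like `subprocess.run` if you wanted to execute them.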

You can still use constraints when you want to bring in a new version of
Airflow together with the new "Golden" set of dependencies for that version.
Afterwards you might want to repeat the `pip install --upgrade` (or
downgrade) to move some providers and dependencies to other versions,
regardless of what is in the new "Golden" set. So if you find some breaking
changes, you can still manage them on your own.

3. Installing airflow providers via == (warning, Dragons ahead!)

After we release the providers this week, there is one scenario of
installing providers that might have undesired consequences (for both local
and image installations). The new providers will have an
`apache-airflow>=2.1.0` install requirement. This means that they are not
compatible with Airflow 2.0.* and you MUST upgrade Airflow to 2.1+ first if
you want to use the new providers. If you have Airflow 2.0.* and you install
such a provider with == (for example `pip install
apache-airflow-providers-google==4.0.0`), pip will AUTOMATICALLY upgrade
Airflow to the latest released version >= 2.1.0 (both locally and in the
image). This might be undesired and look like an accidental side effect,
and in most cases it will require you to run `airflow db upgrade`
afterwards.
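
The condition that triggers this side effect can be sketched in a few lines
of Python. This is a simplified illustration, not how pip resolves
dependencies: it compares plain numeric release versions only (no
pre-releases or other PEP 440 details), and the function names are
hypothetical.

```python
from typing import Tuple


def release_tuple(version: str) -> Tuple[int, ...]:
    """Turn "2.0.2" into (2, 0, 2); pad short versions like "2.1" with zeros."""
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def would_upgrade_airflow(installed_airflow: str, provider_minimum: str) -> bool:
    """True if the installed Airflow is older than the provider's minimum.

    In that case an unpinned `pip install <provider>==X` has to pull in a
    newer Airflow to satisfy the provider's install requirement.
    """
    return release_tuple(installed_airflow) < release_tuple(provider_minimum)


if __name__ == "__main__":
    # Example: new providers require apache-airflow>=2.1.0.
    for installed in ("2.0.2", "2.1.0"):
        print(installed, would_upgrade_airflow(installed, "2.1.0"))
```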

4. Installing providers for managed services

If you run a managed service and you allow people to add their own
requirements, the right way of installing a "fixed version" of a provider
is to pin Airflow in the same command: `pip install
<provider-package>==NEW_PROVIDER_VERSION
apache-airflow==YOUR_INSTALLED_AIRFLOW_VERSION`. This lets pip determine
whether the provider is compatible with your Airflow version and either
install the provider (if it is) or fail with a conflict message (if it is
not). It will never upgrade Airflow "accidentally".
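
A managed service could generate that command along these lines. This is
only a sketch under assumptions: the helper names are hypothetical, the
Airflow distribution name on PyPI is `apache-airflow`, and the installed
version is looked up with the standard library's `importlib.metadata`
(Python 3.8+).

```python
from importlib import metadata
from typing import List, Optional


def installed_airflow_version() -> Optional[str]:
    """Return the installed apache-airflow version, or None if not installed."""
    try:
        return metadata.version("apache-airflow")
    except metadata.PackageNotFoundError:
        return None


def pinned_provider_install(
    provider_package: str, provider_version: str, airflow_version: str
) -> List[str]:
    """Build a pip command that pins BOTH the provider and Airflow.

    With Airflow pinned, pip either installs the provider or fails with a
    conflict; it cannot move Airflow to another version.
    """
    return [
        "pip", "install",
        f"{provider_package}=={provider_version}",
        f"apache-airflow=={airflow_version}",
    ]


if __name__ == "__main__":
    # Fall back to an example version when Airflow is not installed here.
    airflow_version = installed_airflow_version() or "2.0.2"
    command = pinned_provider_install(
        "apache-airflow-providers-google", "4.0.0", airflow_version
    )
    print(" ".join(command))
```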

Happy to discuss this and answer any questions you might have. If you have
any comments, worries etc. - feel free to discuss it here. Let me know if
that explanation is useful for you, or whether things are unclear or maybe
can be better worded.

J.

[1] Upcoming provider's release and Airflow 2.1+ compatibility thread
https://lists.apache.org/thread.html/rdf7ade453f9f90f84b1a84bf00777aab7b4ebceaf58992001556cec4%40%3Cdev.airflow.apache.org%3E

[2] Airflow 2.0 providers
https://medium.com/apache-airflow/airflow-2-0-providers-1bd21ba3bd93

[3] Building images https://airflow.apache.org/docs/docker-stack/build.html

[4] Constraint files
https://airflow.apache.org/docs/apache-airflow/stable/installation.html#constraints-files

-- 
+48 660 796 129
