airflow-dev mailing list archives

From Jarek Potiuk <Jarek.Pot...@polidea.com>
Subject [DISCUSS] Back to (some) dependency pinning
Date Mon, 24 Jun 2019 09:00:03 GMT
With the recent Sphinx problem
<https://issues.apache.org/jira/browse/AIRFLOW-4841> our old enemy is
back. In this case sphinx-autoapi 1.1.0 was released yesterday and it
started to cause our master to fail, triggering a kind of emergency rush to
fix it, as master (and all PRs based on it) would otherwise be broken.

I think I have a proposal that can address similar problems without pushing
us into emergency mode.

*Context:*

I wanted to return to an old discussion - how we can keep unrelated
dependencies from causing emergencies on our side, where we have to quickly
solve such dependency issues when they break our builds.

*Change coming soon:*

The problems will be partially addressed by the last stage of AIP-10 (
https://github.com/apache/airflow/pull/4938 - pending only a Kubernetes test
fix). It effectively freezes the installed dependencies as a cached layer of
the Docker image for builds which do not touch setup.py - so as long as
setup.py does not change, the dependencies will not be updated to the latest
versions.

*Possibly even better long-term solution:*

I think we should address it a bit better. We have had a number of discussions
on pinning dependencies (for example here
<https://lists.apache.org/thread.html/9e775d11cce6a3473cbe31908a17d7840072125be2dff020ff59a441@%3Cdev.airflow.apache.org%3E>).
I think the conclusion there was that Airflow is both a "library" (for DAGs),
where dependencies should not be pinned, and an end product, where the
dependencies should be pinned. So it is a bit of a catch-22 situation.

Looking at the problem with Sphinx, however, it occurred to me that maybe we
can use a hybrid solution. We pin all the libraries (like Sphinx or Flask) that
are used merely to build and test the end product, but we do not pin the
libraries (like google-api) which are used in the context of the library
(writing the operators and DAGs).

What do you think? Maybe that would be the best of both worlds? Then we
would have to classify the dependencies and maybe restructure setup.py
slightly to make an obvious distinction between those two types of
dependencies.
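
To make the classification idea concrete, here is a minimal sketch of how the
two groups might be kept apart and checked. It is not taken from Airflow's
actual setup.py: the list names, the helper functions, and every version
number below are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical group 1: tools used only to build and test the end product.
# These are pinned to exact versions (version numbers are placeholders).
BUILD_AND_TEST_PINNED = [
    "Sphinx==1.8.5",
    "sphinx-autoapi==1.0.0",
    "Flask==1.0.3",
]

# Hypothetical group 2: libraries used from operators and DAGs
# (the "library" context). These are deliberately left unpinned.
LIBRARY_UNPINNED = [
    "google-api-python-client>=1.6.0",
]

# A requirement counts as pinned only if it uses "==" with no wildcard.
_PIN = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=*]+$")


def is_pinned(requirement):
    """Return True if the requirement string names one exact version."""
    return bool(_PIN.match(requirement.strip()))


def check_classification(pinned, unpinned):
    """Return a list of messages for requirements that are in the wrong group."""
    errors = []
    for req in pinned:
        if not is_pinned(req):
            errors.append("expected an exact pin: " + req)
    for req in unpinned:
        if is_pinned(req):
            errors.append("expected an open range: " + req)
    return errors


if __name__ == "__main__":
    for message in check_classification(BUILD_AND_TEST_PINNED, LIBRARY_UNPINNED):
        print(message)
```

A check like this could run in CI so that a dependency added to the wrong
group is caught during review rather than when an upstream release breaks
master.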

J.

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
