airflow-dev mailing list archives

From Ash Berlin-Taylor <ash_airflowl...@firemirror.com>
Subject Backwards compatibility - what do we mean? when? how long?
Date Tue, 19 Dec 2017 18:45:29 GMT
Hi,

A question came up on a GitHub issue about what exactly we meant by backwards compatibility,
and I figured we as a project should work out what we mean when we say we want to maintain
compat. And most importantly document it (don't worry, I'm volunteering to do it, so long
as we reach consensus).

So the issue that spawned this was https://github.com/apache/incubator-airflow/pull/2806
which changes some config setting names.

In the example of this particular PR it's not a big deal, and supporting both names is not
a difficult change, but I felt this was a discussion worth having.
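
To make "supporting both names" concrete, here is a minimal sketch of one way a renamed
setting could keep working under its old name while warning the user. It is not the code
from the PR and does not use Airflow's configuration API: the plain dict, the get_option
helper, and the option names are all hypothetical.

```python
import warnings

# Hypothetical mapping: (section, new option name) -> old, deprecated option name.
DEPRECATED_NAMES = {
    ("core", "new_setting_name"): "old_setting_name",
}


def get_option(config, section, key):
    """Return config[section][key], falling back to a deprecated name with a warning."""
    values = config.get(section, {})
    if key in values:
        return values[key]

    old_key = DEPRECATED_NAMES.get((section, key))
    if old_key is not None and old_key in values:
        # The old name still works for now, but the user is told to migrate.
        warnings.warn(
            "The {old} option in [{section}] has been renamed to {new}; "
            "the old name will stop working in a future release.".format(
                old=old_key, section=section, new=key
            ),
            DeprecationWarning,
            stacklevel=2,
        )
        return values[old_key]

    raise KeyError("{}/{}".format(section, key))
```

With something like this, a config file that only sets the old name keeps working for one
more release, and the warning tells the user what to change before the old name is removed.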

My view is somewhat influenced by Django which has a view that if you have no deprecation
warnings now (say on 1.8) then if you upgrade to 1.9 things will still work, but now you might
get deprecation warnings that will need to be fixed before upgrading to 1.10. (Version numbers
just for example.)

Bolke's view is: "backwards compatible is more along the lines of FreeBSD. API/ABI compatible,
but config changes can happen and are in UPDATING"

Now, Airflow is still a relatively young project so perhaps we want to be a little more relaxed
about this for the time being? Do we want to just say backwards compatibility is required
for operators and hooks -- i.e. things that most users are going to interact with -- while other
bits of internals are "fair game to change", perhaps with a best-effort goal?

So some questions:

1) Do we want to follow something like SemVer (http://semver.org/) where
backwards-incompatible changes need a major version bump, which would be to Airflow 2.0 in
this case? I think this is what we do, but I don't believe it is written down anywhere.

2) Do we want to limit or exclude the back-compat _guarantees_ to any areas? i.e. just include
Hooks and Operators? What about data model/DB tables? Log formats? Log file paths?
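
For question 1, the SemVer rule can be sketched roughly as below. This is only an
illustration of the rule as described on semver.org for versions at 1.0.0 or later; the
next_version helper and the version numbers are made up for the example and are not part
of any release tooling.

```python
def parse_version(version):
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of three ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def next_version(current, breaking=False, feature=False):
    """Return the next version number for a release under SemVer-style rules."""
    major, minor, patch = parse_version(current)
    if breaking:
        # Any backwards-incompatible change forces a major bump.
        return "{}.0.0".format(major + 1)
    if feature:
        # Backwards-compatible additions bump the minor version.
        return "{}.{}.0".format(major, minor + 1)
    # Backwards-compatible fixes only bump the patch version.
    return "{}.{}.{}".format(major, minor, patch + 1)
```

So under these assumptions next_version("1.9.0", breaking=True) gives "2.0.0", while
next_version("1.9.0", feature=True) gives "1.10.0". Question 2 is then really about which
changes count as "breaking".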

Does anyone have any other strong opinions about areas I haven't mentioned?

My answers are:
1) Yes to SemVer
2) Strong Yes to Hooks and Operators and anything used directly in DAGs, with a weak yes to
including config and other python classes too.

My aim here is to start some discussion. If we get to any consensus then after the holidays
I'll open a PR to update the docs.

Cheers,
-ash


