From: William Pursell
Date: Tue, 16 Oct 2018 09:23:24 -0600
Subject: Re: Pinning dependencies for Apache Airflow
To: dev@airflow.incubator.apache.org

I'm jumping in a bit late here, and perhaps have missed some of the discussion, but I haven't seen any mention of the fact that pinning versions in setup.py isn't going to solve the problem. Perhaps it's my lack of experience with pip, but currently pip doesn't provide any guarantee that the version of a dependency specified in setup.py will be the version that winds up being installed. Is this a known issue that is being intentionally ignored because it's hard (and out of scope) to solve?

I agree that versions should be pinned in setup.py for stable releases, but I think we need to be aware that this won't solve the problem.

On Tue, Oct 16, 2018 at 3:18 AM Daniel (Daniel Lamblin) [BDP - Seoul] wrote:
>
> That Slack comment is mine, thanks.
>
> If it's a vote my vote is: please limit the package versions in setup.py for any branch meant to be stable.
>
> To be specific:
> * I don't have an expectation that installing from master is going to work every time. But when it doesn't I do expect to find the CI is broken there's a "red" indicator there-of, even if there was no commit or everyone was on vacation, it should be running a couple times a day to catch breakage due to dependencies. So, I don't really care if packages in setup.py from master are pinned, or just unlimited minimums, Though I'd think that any version can be limited to less than the next major number…
> * I do have an expectation that installing v1-10-stable, or v1-8-stable, vX-Y-stable etc. is 100% going to work every time, I do think that its package versions should be the same as those that were used to pass the release check-off process.
>
> It is probably easiest for maintainers (us?) if when prepping a stable branch, the setup.py is modified to specify exactly the == package versions of absolutely everything that passed the test, QA, release process.
> If what you guys are discussing with pipenv, pip-tools, .lock requirement.txt files etc is integrated with setup.py, then great; otherwise, not good enough (I'll explain).
>
> I hear there's a concern that say sshtunnel vX.Y turns out to be a security nightmare you will regret pinning v1-10-stable to sshtunnel-vX.Y once it's known, but I disagree this is a big deal because it will be A) on the user to know that maybe maintainers didn't update this dependency check overnight, B) on the maintainers to go to each _maintained_ stable branch and bring it up to date with the patched version and redo the QA & release checks, and C) on maintainers to mark any stable branch that isn't or can't be updated as "has known security issues", and finally D) have a timeline for marking releases as unmaintained, probably has security issues, we don't even check for.
> I doubt the ASF has a problem with maintaining a stable release with security updates or (as is the current need) fixing a stable release such that it builds. But I don't know exactly the release rules.
>
> I think I understand that what I'm proposing is dandy for the git branches but is an issue for PyPI, because there you do not update a released version, and deleting one breaks things even worse, so… if 1.10.0 is broken and 1.10.1 is the next release in test/development, the fixed 1.10.0 should be released as an update that isn't 1.10.1; something like 1.10.0.1. Note that HOW 1.10.0 BROKE would not have happened with more careful limits on the version, possibly requiring full pinning of exact versions. So that's again in favor of fixing these stable branches and their releases to exact versions that were known good. Now a security patch would still have to become, say, 1.10.0.1. I don't know if it's possible to go back to a released PyPI package and update its readme, description, or any part to mark it as known to contain a security issue or not.
>
> Trying not to go long, but here's the part where I explain why the setup.py has to be fixed to what passed ci, qa, release etc. for a stable branch or packaged release, by explaining what bit me:
> I made a docker image from a local fork of the v1-10-stable branch a couple months ago. A week ago, someone said to me, hey, I want to use the SSHOperator but I get this message about paramiko not being installed. So, look at that, my Dockerfile didn't add `ssh` to the options when running `pip install --no-cache-dir -e "${AIRFLOW_SRC_HOME}/[async,celery,crypto,cgroups,hdfs,hive,jdbc,ldap,mssql,mysql,postgres,s3,slack,statsd]"`, darn-it. This is easy though, let me add that now. You can see how this depends on what's in setup.py. And you might see how this didn't bring up the log message `pkg_resources.ContextualVersionConflict: (Click 7.0 (/usr/local/lib/python3.6/site-packages), Requirement.parse('click==6.7'), {'flask-appbuilder'})` until the replacement webserver tried to start up, glad I didn't touch the scheduler first.
> You might think, oh Daniel, you should pull the existing image you released and just pip install paramiko and pysftp after reading the setup.py file then release commit that and push it.
> Well, because Docker says it's a best practice to build each release from the Dockerfile instead of interactively adding layers on top, there's a system that checks (kind of) and (usually) stops that idea from working.
> Also, that would require cautious work, doesn't support the simple fix, and doesn't support people who built the release "late".
> What I did end up doing was using pip freeze to figure out exactly what's on my prior working release (surprise boto3 1.8.6 is there, though master says boto3 <= 1.8.0) and using it as a basis for the Dockerfile prior to that `pip install -e`. It's not quite 100% working yet, but the remaining issues have nothing to do with this discussion.
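The `pip freeze` comparison described above can be sketched in a few lines of Python. This is only an illustration, not what Daniel actually ran: the helper names `freeze_to_pins` and `find_conflicts` and the sample input are hypothetical, and the parser deliberately ignores anything that is not a plain `name==version` line.

```python
def freeze_to_pins(freeze_output):
    """Parse `pip freeze`-style text into a {name: version} dict of exact pins."""
    pins = {}
    for raw in freeze_output.splitlines():
        line = raw.strip()
        # Skip blanks, comments, editable installs ("-e ...") and anything
        # that is not an exact "name==version" pin.
        if not line or line.startswith(("#", "-")) or "==" not in line:
            continue
        name, _, version = line.partition("==")
        pins[name.strip().lower()] = version.strip()
    return pins


def find_conflicts(installed, required):
    """Return {name: (installed_version, required_version)} for every mismatch."""
    return {
        name: (installed[name], wanted)
        for name, wanted in required.items()
        if name in installed and installed[name] != wanted
    }


# Hypothetical sample mirroring the versions mentioned in the message.
sample = """\
boto3==1.8.6
Click==7.0
-e git+https://example.invalid/airflow.git#egg=apache-airflow
"""
installed = freeze_to_pins(sample)
conflicts = find_conflicts(installed, {"click": "6.7"})
print(conflicts)  # {'click': ('7.0', '6.7')}
```

Running a check like this against a known-good environment before rebuilding an image would surface the Click 7.0 versus click==6.7 mismatch at build time rather than when the webserver starts.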
>
> In summary, I fully expected a stable branch of a package to be able to be installed at any time and operate the same way it did when it was cut. I'm not sure why there are any votes another way about that, but I suspect those votes are more about what goes on at master and on ci than in a release, and are thus, to my mind, beside the point.
>
> Thanks,
> -Daniel
>
> On 10/15/18, 9:05 PM, "Jarek Potiuk" wrote:
>
> Speaking of which - just to show what kind of problems we are talking about - here is a link to a relevant discussion in troubleshooting @ slack from today, where someone tries to install v1.10-stable and needs help.
> This is exactly the kind of problems I think are important to solve, whatever way we choose to solve it:
> https://apache-airflow.slack.com/archives/CCQ7EGB1P/p1539573567000100
>
> I really don't think it's a good idea to put especially new Airflow users in this situation where they need to search through devlist and upstream commits or ask for help to just be able to install stable release of Airflow.
>
> J.
>
> On Mon, Oct 15, 2018 at 9:29 AM Jarek Potiuk wrote:
>
> > Sorry for late reply - I was travelling, was at Cloud Next in London last week (BTW. there were talks about Composer/Airflow there).
> >
> > I see the point, it's indeed very difficult to solve when we want both: stability of releases and flexibility of using released version and write the code within it. I think some trade-offs need to be made as we won't solve it all with a one-size-fits-all approach. Answering your question George - the value of pinning for release purpose is addressing "stability" need.
> >
> > - Due to my background I come from the "stability" side (which is more user-focused) - i.e.
> > the main problem that I want to solve is to make sure that someone who wants to install airflow afresh and start using it as a beginner user, can always run 'pip install airflow' and it will get installed. For me this is the point when many users may simply get put off if it refuses to install out-of-the-box. A few months ago I actually evaluated airflow to run an ML pipeline for the startup I was at at that time. If back then it refused to install out-of-the-box, my evaluation results would be 'did not pass the basic criteria'. Luckily it did not happen, we did a more elaborate evaluation then - we did not use Airflow eventually but for other reasons. For us the criterion "it just works!" was super important - because we did not have time to deep dive into details, find out why things do not work - we had a lot of "core/ML/robotics" things to worry about and any hurdles with unstable tools would be a major distraction. We really wanted to write several DAGs and get them executed in stable, repeatable way, and that when we install it on production machine in two months - it continues to work without any extra work.
> > - then there are a lot of concerns from the "flexibility" side (which is more advanced users/developers). It becomes important when you want to actively develop your DAGs (you start using more than just built-in operators and start developing lot more code in DAGs or use PythonOperator more and more). Then of course it is important to get the "flexible" approach. I argue that in these cases the "active" developers might be more inclined to do any tweaking of their environment as they are more advanced and might be more experienced in the dependencies and would be able to downgrade/upgrade dependencies as they will need in their virtualenvs.
> > Those people should be quite ok with spending a bit more time to get their environment tweaked to their needs.
> >
> > I was thinking if there is a way to satisfy both? And I have a wild idea:
> >
> > - we have two sets of requirements (easy-upgradeable "stable" ones in requirements.txt/poetry and flexible with versions in setup.py (or similar)) - as proposed earlier in this thread
> > - we release two flavours of pip-installable airflow: 1.10.1 with stable/pinned dependencies and 1.10.1-devel (we can pick other flavour name) with flexible dependencies. It's quite common to have devel releases in Linux world - they serve a bit different purpose (like include headers for C/C++ programs) and it's usually extra package on top of the basic one, but the basic idea is similar - if you are a user, you install 1.10.1, if you are active developer, you install 1.10.1-devel
> >
> > What do you think?
> >
> > Off-topic a bit: a friend of mine pointed me to this excellent talk by Elm creator: "The Hard Parts of Open Source" by Evan Czaplicki and it made me think differently about the discussion we have :D
> >
> > J.
> >
> > On Wed, Oct 10, 2018 at 7:51 PM George Leslie-Waksman wrote:
> >
> >> It's not upgrading dependencies that I'm worried about, it's downgrading. With upgrade conflicts, we can treat the dependency upgrades as a necessary aspect of the Airflow upgrade.
> >>
> >> Suppose Airflow pins LibraryA==1.2.3 and then a security issue is found in LibraryA==1.2.3. This issue is fixed in LibraryA==1.2.4.
> >> Now, we are placed in the annoying situation of either: a) managing our deployments so that we install Airflow first, and then upgrade LibraryA and ignore pip's warning about incompatible versions, b) keeping the insecure version of LibraryA, c) waiting for another Airflow release and accepting all other changes, d) maintaining our own fork of Airflow and diverging from mainline.
> >>
> >> If Airflow specifies a requirement of LibraryA>=1.2.3, there is no problem whatsoever. If we're worried about API changes in the future, there's always LibraryA>=1.2.3,<1.3 or LibraryA>=1.2.3,<2.0
> >>
> >> As has been pointed out, PythonOperator tasks run in the same venv as Airflow, so it is necessary that users be able to control dependencies for their code.
> >>
> >> To be clear, it's not always a security risk but this is not a hypothetical issue. We ran into a code incompatibility with psutil that mattered to us but had no impact on Airflow (see: https://github.com/apache/incubator-airflow/pull/3585) and are currently seeing SQLAlchemy held back without any clear need (https://github.com/apache/incubator-airflow/blob/master/setup.py#L325).
> >>
> >> Pinning dependencies for releases will force us (and I expect others) to either: ignore/workaround the pinning, or not use Airflow releases. Both of those options exactly defeat the point.
> >>
> >> If people are on board with pinning / locking all dependencies for CI purposes, and we can constrain requirements to ranges for necessary compatibility, what is the value of pinning all dependencies for release purposes?
> >>
> >> --George
> >>
> >> On Tue, Oct 9, 2018 at 11:57 AM Jarek Potiuk wrote:
> >>
> >> > I am still not convinced that pinning is bad.
> >> > I re-read the whole mail thread and the thread from 2016 <https://github.com/apache/incubator-airflow/pull/1809#issuecomment-257502174> to read all the arguments, but I stand by pinning.
> >> >
> >> > I am - of course - not sure about graduation argument. I would just imagine it might be the case. I however really think that situation we are in now is quite volatile. The latest 1.10.0 cannot be clean-installed via pip without manually tweaking and forcing lower version of flask-appbuilder. Even if you use the constraints file it's pretty cumbersome because you'd have to somehow know that you need to do exactly that (not at all obvious from the error you get). Also it might at any time get worse as other packages get newer versions released. The thing here is that maintainers of flask-appbuilder did nothing wrong, they simply released new version with click dependency version increased (probably for a good reason) and it's airflow's cross-dependency graph which makes it incompatible.
> >> >
> >> > I am afraid that if we don't change it, it's all but guaranteed that every single release at some point of time will "deteriorate" and refuse to clean-install. If we want to solve this problem (maybe we don't and we accept it as it is?), I think the only way to solve it is to hard-pin all the requirements at the very least for releases.
> >> >
> >> > Of course we might choose pinning only for releases (and CI builds) and have the compromise that Matt mentioned. I have the worry however (also mentioned in the previous thread) that it will be hard to maintain. Effectively you will have to maintain both in parallel.
> >> > And the case with constraints is a nice workaround for someone who actually needs a specific (even newer) version of a specific package in their environment.
> >> >
> >> > Maybe we should simply give it a try and do Proof-Of-Concept/experiment as also Fokko mentioned?
> >> >
> >> > We could have a PR with pinning enabled, and maybe ask the people who voice concerns about environment to give it a try with those pinned versions and see if that makes it difficult for them to either upgrade dependencies and fork apache-airflow or use constraints file of pip?
> >> >
> >> > J.
> >> >
> >> >
> >> > On Tue, Oct 9, 2018 at 5:56 PM Matt Davis wrote:
> >> >
> >> > > Erik, the Airflow task execution code itself of course must run somewhere with Airflow installed, but if the task is making a database query or a web request or running something in Docker there's separation between the environments and maybe you don't care about Python dependencies at all except to get Airflow running. When running Python operators that's not the case (as you already deal with).
> >> > >
> >> > > - Matt
> >> > >
> >> > > On Tue, Oct 9, 2018 at 2:45 AM EKC (Erik Cederstrand) wrote:
> >> > >
> >> > > > This is maybe a stupid question, but is it even possible to run tasks in an environment where Airflow is not installed?
> >> > > >
> >> > > >
> >> > > > Kind regards,
> >> > > >
> >> > > > Erik
> >> > > >
> >> > > > ________________________________
> >> > > > From: Matt Davis
> >> > > > Sent: Monday, October 8, 2018 10:13:34 PM
> >> > > > To: dev@airflow.incubator.apache.org
> >> > > > Subject: Re: Pinning dependencies for Apache Airflow
> >> > > >
> >> > > > It sounds like we can get the best of both worlds with the original proposals to have minimal requirements in setup.py and "guaranteed to work" complete requirements in a separate file. That way we have flexibility for teams that run airflow and tasks in the same environment and guidance on a working set of requirements. (Disclaimer: I work on the same team as George.)
> >> > > >
> >> > > > Thanks,
> >> > > > Matt
> >> > > >
> >> > > > On Mon, Oct 8, 2018 at 8:16 AM Ash Berlin-Taylor wrote:
> >> > > >
> >> > > > > Although I think I come down on the side against pinning, my reasons are different.
> >> > > > >
> >> > > > > For the two (or more) people who have expressed concern about it would pip's "Constraint Files" help:
> >> > > > >
> >> > > > > https://pip.pypa.io/en/stable/user_guide/#constraints-files
> >> > > > >
> >> > > > > For example, you could add "flask-appbuilder==1.11.1" in to this file, specify it with `pip install -c constraints.txt apache-airflow` and then whenever pip attempted to install _any_ version of FAB it would use the exact version from the constraints file.
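The constraints-file workflow described above could be scripted roughly as follows. This is a minimal sketch under stated assumptions: the helper names `render_constraints` and `pip_install_command` are hypothetical, the file name `constraints.txt` is taken from the example in the message, and the command is only built, not executed.

```python
import sys


def render_constraints(pins):
    """Render {name: version} as the text of a pip constraints file."""
    return "".join(f"{name}=={version}\n" for name, version in sorted(pins.items()))


def pip_install_command(constraints_path, package):
    """Build (but do not run) a `pip install -c <file> <package>` command line."""
    return [sys.executable, "-m", "pip", "install", "-c", constraints_path, package]


# The pin below is the one quoted in the message.
text = render_constraints({"flask-appbuilder": "1.11.1"})
command = pip_install_command("constraints.txt", "apache-airflow")
print(text, end="")
print(" ".join(command[1:]))
```

To use it, one would write `text` to `constraints.txt` and pass `command` to `subprocess.run(command, check=True)`. A constraints file only limits which version is chosen if a package gets installed; it does not by itself request any package.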
> >> > > > >
> >> > > > > I don't buy the argument about pinning being a requirement for graduation from Incubation fwiw - it's an unavoidable artefact of the open-source world we develop in.
> >> > > > >
> >> > > > > https://libraries.io/ offers a (free?) service that will monitor apps' dependencies for being out of date, might be better than writing our own solution.
> >> > > > >
> >> > > > > Pip has for a while now supported a way of saying "this dep is for py2.7 only":
> >> > > > >
> >> > > > > > Since version 6.0, pip also supports specifiers containing environment markers like so:
> >> > > > > >
> >> > > > > > SomeProject ==5.4 ; python_version < '2.7'
> >> > > > > > SomeProject; sys_platform == 'win32'
> >> > > > >
> >> > > > >
> >> > > > > Ash
> >> > > > >
> >> > > > >
> >> > > > > > On 8 Oct 2018, at 07:58, George Leslie-Waksman <waksman@gmail.com> wrote:
> >> > > > > >
> >> > > > > > As a member of a team that will also have really big problems if Airflow pins all requirements (for reasons similar to those already stated), I would like to add a very strong -1 to the idea of pinning them for all installations.
> >> > > > > >
> >> > > > > > In a number of situations on our end, to avoid similar problems with CI, we use `pip-compile` from pip-tools (also mentioned):
> >> > > > > > https://pypi.org/project/pip-tools/
> >> > > > > >
> >> > > > > > I would like to suggest a middle ground of:
> >> > > > > >
> >> > > > > > - Have the installation continue to use unpinned (`>=`) with minimum necessary requirements set
> >> > > > > > - Include a pip-compiled requirements file (`requirements-ci.txt`?) that is used by CI
> >> > > > > > - - If we need, there can be one file for each incompatible python version
> >> > > > > > - Append a watermark (hash of `setup.py` requirements?) to the compiled requirements file
> >> > > > > > - Add a CI check that the watermark and original match to ensure no drift since last compile
> >> > > > > >
> >> > > > > > I am happy to do much of the work for this, if it can help avoid pinning all of the depends at the installation level.
> >> > > > > >
> >> > > > > > --George Leslie-Waksman
> >> > > > > >
> >> > > > > > On Sun, Oct 7, 2018 at 1:26 PM Maxime Beauchemin wrote:
> >> > > > > >>
> >> > > > > >> pip-tools can definitely help here to ship a reference [locked] `requirements.txt` that can be used in [all or part of] the CI. It's actually kind of important to get CI to fail when a new [backward incompatible] lib comes out and breaks things while allowing version ranges.
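The watermark idea in George's proposal above (hash the `setup.py` requirements, append the hash to the compiled file, and have CI compare the two) might look something like this. It is a sketch only: the marker format `# setup-requirements-sha256:` and the function names are assumptions made for illustration, not anything pip-tools or Airflow provides.

```python
import hashlib

# Hypothetical marker format for the last line of the compiled file.
WATERMARK_PREFIX = "# setup-requirements-sha256: "


def requirements_watermark(requirements):
    """Order-independent SHA-256 over the stripped requirement strings."""
    normalized = sorted(r.strip() for r in requirements if r.strip())
    return hashlib.sha256("\n".join(normalized).encode("utf-8")).hexdigest()


def stamp(compiled_text, requirements):
    """Append the watermark line to the compiled requirements text."""
    mark = WATERMARK_PREFIX + requirements_watermark(requirements)
    return compiled_text.rstrip("\n") + "\n" + mark + "\n"


def is_fresh(stamped_text, requirements):
    """CI check: does the last line match the current setup.py requirements?"""
    lines = stamped_text.splitlines()
    expected = WATERMARK_PREFIX + requirements_watermark(requirements)
    return bool(lines) and lines[-1] == expected


# Hypothetical loose requirements and their compiled (pinned) counterpart.
loose = ["flask-appbuilder>=1.11.1", "click>=6.7"]
stamped = stamp("click==6.7\nflask-appbuilder==1.11.1\n", loose)
print(is_fresh(stamped, loose))
print(is_fresh(stamped, loose + ["boto3>=1.7.0"]))
```

Sorting before hashing means reordering the requirement list in `setup.py` does not trigger a false drift alarm, while adding, removing, or changing any requirement does.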
> >> > > > > >>
> >> > > > > >> I think there may be challenges around pip-tools and projects that run in both python2.7 and python3.6. You sometimes need to have 2 requirements.txt lock files.
> >> > > > > >>
> >> > > > > >> Max
> >> > > > > >>
> >> > > > > >> On Sun, Oct 7, 2018 at 5:06 AM Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
> >> > > > > >>
> >> > > > > >>> It's a nice one :). However I think when/if we go to pinned dependencies the way poetry/pip-tools do it, this will be suddenly a lot less useful. It will be very easy to track dependency changes (they will be always committed as a change in the .lock file or requirements.txt) and if someone has a problem while upgrading a dependency (always consciously, never accidentally) it will simply fail during CI build and the change won't get merged/won't break the builds of others in the first place :).
> >> > > > > >>>
> >> > > > > >>> J.
> >> > > > > >>>
> >> > > > > >>> On Sun, Oct 7, 2018 at 6:26 AM Deng Xiaodong <xd.deng.r@gmail.com> wrote:
> >> > > > > >>>
> >> > > > > >>>> Hi folks,
> >> > > > > >>>>
> >> > > > > >>>> On top of this discussion, I was thinking we should have the ability to quickly monitor dependency release as well. Previously, it happened a few times that CI kept failing for no reason and eventually it turned out it was due to dependency release. But it took us some time, sometimes a few days, to realise the failure was because of dependency release.
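The kind of release monitoring raised here can be approximated with PyPI's JSON endpoint. The sketch below is an assumption-laden illustration and not the implementation of XD's tool: it assumes the payload served at `https://pypi.org/pypi/<package>/json` carries `info.version` and a `releases` mapping whose file entries include `upload_time`, and the function names are hypothetical.

```python
import json
from urllib.request import urlopen


def latest_release(payload):
    """Return (version, earliest upload_time or None) from a PyPI JSON payload."""
    version = payload["info"]["version"]
    files = payload.get("releases", {}).get(version, [])
    upload_times = [f["upload_time"] for f in files if "upload_time" in f]
    return version, (min(upload_times) if upload_times else None)


def fetch_payload(package):
    """Download package metadata from PyPI (needs network access)."""
    with urlopen(f"https://pypi.org/pypi/{package}/json") as response:
        return json.load(response)


def newest_first(payloads):
    """Rows of (name, version, upload_time), most recently uploaded first."""
    rows = [(name, *latest_release(p)) for name, p in payloads.items()]
    # ISO-8601 timestamps in the same format sort correctly as strings.
    return sorted(rows, key=lambda row: row[2] or "", reverse=True)
```

A caller would build the input with something like `{name: fetch_payload(name) for name in ["awscli", "bleach"]}` and compare the resulting upload times against the time CI first started failing.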
> >> > > > > >>>>
> >> > > > > >>>> To partially address this, I tried to develop a mini tool to help us check the latest release of Python packages & the release date-time on PyPi. So, by comparing it with our CI failure history, we may be able to troubleshoot faster.
> >> > > > > >>>>
> >> > > > > >>>> Output Sample (ordered by upload time in desc order):
> >> > > > > >>>> Package Name       Latest Version   Upload Time
> >> > > > > >>>> awscli             1.16.28          2018-10-05T23:12:45
> >> > > > > >>>> botocore           1.12.18          2018-10-05T23:12:39
> >> > > > > >>>> promise            2.2.1            2018-10-04T22:04:18
> >> > > > > >>>> Keras              2.2.4            2018-10-03T20:59:39
> >> > > > > >>>> bleach             3.0.0            2018-10-03T16:54:27
> >> > > > > >>>> Flask-AppBuilder   1.12.0           2018-10-03T09:03:48
> >> > > > > >>>> ...                ...
> >> > > > > >>>>
> >> > > > > >>>> It's a minimal tool (not perfect yet but working). I have hosted this tool at https://github.com/XD-DENG/pypi-release-query .
> >> > > > > >>>>
> >> > > > > >>>>
> >> > > > > >>>> XD
> >> > > > > >>>>
> >> > > > > >>>> On Sat, Oct 6, 2018 at 12:25 AM Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
> >> > > > > >>>>
> >> > > > > >>>>> Hello Erik,
> >> > > > > >>>>>
> >> > > > > >>>>> I understand your concern. It's a hard one to solve in general (i.e. dependency-hell).
> >> > > > > >>>>> It looks like in this case you treat Airflow as 'library', where for some other people it might be more like 'end product'. If you look at the "pinning" philosophy - the "pin everything" is good for end products, but not good for libraries. In the case you have, Airflow is treated as a bit of both. And it's perfectly valid case at that (with custom python DAGs being central concept for Airflow).
> >> > > > > >>>>> However, I think it's not as bad as you think when it comes to exact pinning.
> >> > > > > >>>>>
> >> > > > > >>>>> I believe - a bit counter-intuitively - that tools like pip-tools/poetry with exact pinning result in having your dependencies upgraded more often, rather than less - especially in complex systems where dependency-hell creeps in. If you look at Airflow's setup.py now - it's a bit scary to make any change to it. There is a chance it will blow up in your face if you change it. You never know why there is 0.3 < ver < 1.0 - and if you change it, whether it will cause chain reaction of conflicts that will ruin your work day.
> >> > > > > >>>>>
> >> > > > > >>>>> On the contrary - if you change it to exact pinning in .lock/requirements.txt file (poetry/pip-tools) and have much simpler (and commented) exclusion/avoidance rules in your .in/.toml file, the whole setup might be much easier to maintain and upgrade. Every time you prepare for release (or even once in a while for master) one person might consciously attempt to upgrade all dependencies to latest ones. It should be almost as easy as letting poetry/pip-tools help with figuring out what is the latest set of dependencies that will work without conflicts. It should be rather straightforward (I've done it in the past for fairly complex systems). What those tools enable is doing single-shot upgrade of all dependencies. After doing it you can make sure that all tests work fine (and fix any problems that result from it). And then you test it thoroughly before you make final release. You can do it in separate PR - with automated testing in Travis which means that you are not disturbing work of others (compilation/building + unit tests are guaranteed to work before you merge it) while doing it. It's all conscious rather than accidental.
> Nice side effect of that is that with every release you can actually
> "catch-up" with latest stable versions of many libraries in one go.
> It's better than waiting until someone deliberately upgrades to newer
> version (and the rest remain terribly out-dated as is the case for
> Airflow now).
>
> So a bit counterintuitively I think tools like pip-tools/poetry help
> you to catch up faster in many cases. That is at least my experience
> so far.
>
> Additionally, Airflow is an open system - if you have very specific
> needs for requirements, you might actually - in the very same way with
> pip-tools/poetry - upgrade all your dependencies in your local fork of
> Airflow before someone else does it in master/release. Those tools
> kind of democratise dependency management. It should be as easy as
> `pip-compile --upgrade` or `poetry update` and you will get all the
> "non-conflicting" latest dependencies in your local fork (and poetry
> especially seems to do all the heavy lifting of figuring out which
> versions will work). You should be able to test and publish it locally
> as your private package for local installations.
> You can even mark the specific dependency for which you want to use a
> specific version and let pip-tools/poetry figure out exact versions of
> the other requirements. You can even make a PR with such an upgrade
> eventually to get it into master faster. You can even downgrade in a
> similar way in case a newer dependency causes problems for you. Guided
> by the tools, it's much faster than figuring the versions out by
> yourself.
>
> As long as we have a simple way of managing it and document how to
> upgrade/downgrade dependencies in your own fork, and mention how to
> locally release Airflow as a package, I think your case could be
> covered even better than now. What do you think?
>
> J.
>
> On Fri, Oct 5, 2018 at 2:34 PM EKC (Erik Cederstrand) wrote:
>
>> For us, exact pinning of versions would be problematic. We have DAG
>> code that shares direct and indirect dependencies with Airflow, e.g.
>> lxml, requests, pyhive, future, thrift, tzlocal, psycopg2 and ldap3.
>> If our DAG code for some reason needs a newer point release due to a
>> bug that's fixed, then we can't cleanly build a virtual environment
>> containing the fixed version.
>> For us, it's already a problem that Airflow has quite strict (and
>> sometimes old) requirements in setup.py.
>>
>> Erik
>> ________________________________
>> From: Jarek Potiuk
>> Sent: Friday, October 5, 2018 2:01:15 PM
>> To: dev@airflow.incubator.apache.org
>> Subject: Re: Pinning dependencies for Apache Airflow
>>
>> I think one solution to release approach is to check as part of
>> automated Travis build if all requirements are pinned with == (even
>> the deep ones) and fail the build in case they are not for ALL
>> versions (including dev). And of course we should document the
>> approach of releases/upgrades etc. If we do it all the time for
>> development versions (which seems quite doable), then transitively
>> all the releases will also have pinned versions and they will never
>> try to upgrade any of the dependencies. In poetry (similarly in
>> pip-tools with .in file) it is done by having a .lock file that
>> specifies exact versions of each package so it can be rather easy to
>> manage (so it's worth trying it out I think :D - seems a bit more
>> friendly than pip-tools).
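
A minimal sketch of the "fail the build unless everything is pinned with ==" check described above, assuming the pinned set lives in a plain requirements.txt-style file. The function name `unpinned_requirements`, the regular expression, and the sample versions are illustrative assumptions, not part of Airflow or of any tool named in the thread.

```python
import re

# Matches "name==version", optionally with extras such as "name[extra]==1.0".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s;]+")


def unpinned_requirements(text):
    """Return the requirement lines that are not pinned with '=='."""
    offenders = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options (-r, -e)
            continue
        if not PINNED.match(line):
            offenders.append(line)
    return offenders


# Hypothetical file contents, used only to exercise the check.
sample = """\
flask-login==0.2.1
flask-appbuilder>=1.11.1   # flexible: would fail the check
requests==2.19.1
lxml
"""
bad = unpinned_requirements(sample)
print(bad)
```

In a CI job the script would exit with a non-zero status whenever the returned list is non-empty, which is what makes the build fail.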
>>
>> There is a drawback - of course - with manually updating the module
>> that you want, but I really see that as an advantage rather than a
>> drawback, especially for users. This way you maintain the property
>> that it will always install and work the same way no matter if you
>> installed it today or two months ago. I think the biggest drawback
>> for maintainers is that you need some kind of monitoring of security
>> vulnerabilities and cannot rely on automated security upgrades. With
>> >= requirements those security updates might happen automatically
>> without anyone noticing, but to be honest I don't think such upgrades
>> are guaranteed even in the current setup for all security issues for
>> all libraries anyway.
>>
>> Finding the need to upgrade because of security issues can be quite
>> automated. Even now I noticed Github started to inform owners about
>> potential security vulnerabilities in libraries used by their
>> project. Those notifications can be sent to devlist and turned into
>> JIRA issues followed by minor security-related releases (with only a
>> few library dependencies upgraded).
>>
>> I think it's even easier to automate it if you have pinned
>> dependencies - because it's generally easy to find applicable
>> vulnerabilities for specific versions of libraries by static
>> analysers - when you have >=, you never know which version will be
>> used until you actually perform the installation.
>>
>> There is one big advantage for maintainers in the "pinned" case. Your
>> users always have the same dependencies - so when an issue is raised,
>> you can reproduce it more easily. Otherwise it's hard to know which
>> version the user has (as the user could have installed it a month ago
>> or yesterday) and even if you find out by asking the user, you might
>> not be able to reproduce the set of requirements easily (simply
>> because there are already newer versions of the libraries released
>> and they are used automatically). You can ask the user to run
>> `pip install --upgrade` but that's dangerous and pretty lame ("check
>> the latest version - maybe it fixes your problem?") and sometimes not
>> possible (e.g. someone has a pre-built docker image with dependencies
>> from a few months ago and cannot rebuild the image easily).
>>
>> J.
>>
>> On Fri, Oct 5, 2018 at 12:35 PM Ash Berlin-Taylor <ash@apache.org>
>> wrote:
>>
>>> One thing to point out here.
>>>
>>> Right now if you `pip install apache-airflow==1.10.0` in a clean
>>> environment it will fail.
>>>
>>> This is because we pin flask-login to 0.2.1 but flask-appbuilder is
>>> >= 1.11.1, so that pulls in 1.12.0 which requires flask-login >= 0.3.
>>>
>>> So I do think there is maybe something to be said about pinning for
>>> releases. The down side to that is that if there are updates to a
>>> module that we want then we have to make a point release to let
>>> people get it.
>>>
>>> Both methods have drawbacks.
>>>
>>> -ash
>>>
>>>> On 4 Oct 2018, at 17:13, Arthur Wiedmer <arthur.wiedmer@gmail.com>
>>>> wrote:
>>>>
>>>> Hi Jarek,
>>>>
>>>> I will +1 the discussion Dan is referring to and George's advice.
>>>>
>>>> I just want to double check we are talking about pinning in
>>>> requirements.txt only.
>>>>
>>>> This offers the ability to
>>>> pip install -r requirements.txt
>>>> pip install --no-deps airflow
>>>> For a guaranteed install which works.
>>>>
>>>> Several different requirement files can be provided for specific
>>>> use cases, like a stable dev one for instance for people wanting to
>>>> work on operators and non-core functions.
>>>>
>>>> However, I think we should proactively test in CI against unpinned
>>>> dependencies (though it might be a separate case in the matrix), so
>>>> that we get advance warning if possible that things will break.
>>>> CI downtime is not a bad thing here, it actually caught a problem :)
>>>>
>>>> We should unpin as much as possible in setup.py to only maintain
>>>> minimum required compatibility. The process of pinning in setup.py
>>>> is extremely detrimental when you have a large number of python
>>>> libraries installed with different pinned versions.
>>>>
>>>> Best,
>>>> Arthur
>>>>
>>>> On Thu, Oct 4, 2018 at 8:36 AM Dan Davydov wrote:
>>>>
>>>>> Relevant discussion about this:
>>>>>
>>>>> https://github.com/apache/incubator-airflow/pull/1809#issuecomment-257502174
>>>>>
>>>>> On Thu, Oct 4, 2018 at 11:25 AM Jarek Potiuk
>>>>> <Jarek.Potiuk@polidea.com> wrote:
>>>>>
>>>>>> TL;DR; A change is coming in the way how dependencies/requirements
>>>>>> are specified for Apache Airflow - they will be fixed rather than
>>>>>> flexible (== rather than >=).
>>>>>>
>>>>>> This is follow up after Slack discussion we had with Ash and
>>>>>> Kaxil - summarising what we propose we'll do.
>>>>>>
>>>>>> *Problem:*
>>>>>> During last few weeks we experienced quite a few downtimes of
>>>>>> TravisCI builds (for all PRs/branches including master) as some of
>>>>>> the transitive dependencies were automatically upgraded.
>>>>>> This is because in a number of dependencies we have >= rather
>>>>>> than == dependencies.
>>>>>>
>>>>>> Whenever there is a new release of such a dependency, it might
>>>>>> cause a chain reaction with upgrades of transitive dependencies
>>>>>> which might get into conflict.
>>>>>>
>>>>>> An example was Flask-AppBuilder vs flask-login transitive
>>>>>> dependency with click. They started to conflict once AppBuilder
>>>>>> released version 1.12.0.
>>>>>>
>>>>>> *Diagnosis:*
>>>>>> Transitive dependencies with "flexible" versions (where >= is used
>>>>>> instead of ==) are a reason for "dependency hell". We will sooner
>>>>>> or later hit other cases where not fixed dependencies cause
>>>>>> similar problems with other transitive dependencies. We need to
>>>>>> fix-pin them. This causes problems for both - released versions
>>>>>> (cause they stop working!) and for development (cause they break
>>>>>> master builds in TravisCI and prevent people from installing a
>>>>>> development environment from scratch).
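
The failure mode in the diagnosis above can be illustrated with a toy model: a `>=` requirement starts resolving to a different version as soon as a new release appears, while `==` keeps resolving to the same one. The version numbers are the ones quoted in the thread; the release list, the `NEEDS` table (in particular what 1.11.1 requires) and every function name are assumptions made up for this sketch, not real PyPI metadata and not how pip resolves dependencies internally.

```python
def parse(version):
    """Turn '1.12.0' into (1, 12, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, spec):
    """Check one version against a single '==X' or '>=X' specifier."""
    if spec.startswith("=="):
        return parse(version) == parse(spec[2:])
    if spec.startswith(">="):
        return parse(version) >= parse(spec[2:])
    raise ValueError("unsupported specifier: " + spec)


def pick(available, spec):
    """Pick the newest available version that matches the specifier."""
    matching = [v for v in available if satisfies(v, spec)]
    return max(matching, key=parse) if matching else None


def install_ok(releases, appbuilder_spec, flask_login_pin, needs):
    """Return (chosen Flask-AppBuilder version, does our flask-login pin fit?)."""
    chosen = pick(releases, appbuilder_spec)
    return chosen, satisfies(flask_login_pin, needs[chosen])


# Assumed data: what each Flask-AppBuilder release requires of flask-login.
NEEDS = {"1.11.1": ">=0.2", "1.12.0": ">=0.3"}

before = install_ok(["1.11.1"], ">=1.11.1", "0.2.1", NEEDS)
after = install_ok(["1.11.1", "1.12.0"], ">=1.11.1", "0.2.1", NEEDS)
pinned = install_ok(["1.11.1", "1.12.0"], "==1.11.1", "0.2.1", NEEDS)
print(before, after, pinned)
```

With the flexible specifier the result flips from working to conflicting purely because a new release was published; with the exact pin the outcome is the same before and after.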
>>>>>>
>>>>>> *Solution:*
>>>>>>
>>>>>> - Following the old-but-good post
>>>>>>   https://nvie.com/posts/pin-your-packages/ we are going to fix
>>>>>>   the pinned dependencies to specific versions (so basically all
>>>>>>   dependencies are "fixed").
>>>>>> - We will introduce mechanism to be able to upgrade dependencies
>>>>>>   with pip-tools (https://github.com/jazzband/pip-tools).
>>>>>>   We might also take a look at pipenv:
>>>>>>   https://pipenv.readthedocs.io/en/latest/
>>>>>> - People who would like to upgrade some dependencies for their PRs
>>>>>>   will still be able to do it - but such upgrades will be in their
>>>>>>   PR thus they will go through TravisCI tests and they will also
>>>>>>   have to be specified with pinned fixed versions (==). This
>>>>>>   should be part of review process to make sure new/changed
>>>>>>   requirements are pinned.
>>>>>> - In release process there will be a point where an upgrade will
>>>>>>   be attempted for all requirements (using pip-tools) so that we
>>>>>>   are not stuck with older releases.
>>>>>>   This will be in controlled PR environment where there will be
>>>>>>   time to fix all dependencies without impacting others and likely
>>>>>>   enough time to "vet" such changes (this can be done for
>>>>>>   alpha/beta releases for example).
>>>>>> - As a side effect dependencies specification will become far
>>>>>>   simpler and straightforward.
>>>>>>
>>>>>> Happy to hear community comments to the proposal. I am happy to
>>>>>> take a lead on that, open JIRA issue and implement if this is
>>>>>> something community is happy with.
>>>>>>
>>>>>> J.
>>>>>>
>>>>>> --
>>>>>>
>>>>>> *Jarek Potiuk, Principal Software Engineer*
>>>>>> Mobile: +48 660 796 129