From: Björn Pollex
To: dev@airflow.incubator.apache.org
Subject: Re: Pinning dependencies for Apache Airflow
Date: Fri, 5 Oct 2018 08:46:21 +0200
Message-Id: <96E550CB-56EA-4A16-95CE-06D97CCB712E@soundcloud.com>

Hi all,

Have you considered looking into poetry[1]? I’ve had really good experiences with it, we specifically introduced it into our project because we were getting version conflicts, and it resolved them just fine.
It properly supports semantic versioning, so package versions have upper bounds. It also has a full dependency resolver, so even when package upgrades are available, it will only upgrade if the version constraints allow it. It does have some issues though, most notably that it depends on package metadata being correct to properly resolve dependencies, and that’s not always the case.

Cheers,
Björn

[1]: https://poetry.eustace.io/

> On 5. Oct 2018, at 03:58, James Meickle wrote:
>
> I suggest not adopting pipenv. It has a nice "first five minutes" demo, but
> it's simply not baked enough to depend on as a drop-in replacement for pip. We
> are in the process of removing it after finding several serious bugs in our
> POC of it.
>
> On Thu, Oct 4, 2018, 20:30 Alex Guziel wrote:
>
>> FWIW, there's some value in using virtualenv with Docker to isolate
>> yourself from your system's Python.
>>
>> It's worth noting that requirements files can link other requirements
>> files, so that would make groups easier, but note that pip in one run has no
>> guarantee of transitive dependencies not conflicting or overriding. You
>> need pip check for that or use --no-deps.
>>
>> On Thu, Oct 4, 2018 at 5:19 PM Driesprong, Fokko wrote:
>>
>>> Hi Jarek,
>>>
>>> Thanks for bringing this up. I missed the discussion on Slack since I'm on
>>> holiday, but I saw the thread and it was way too interesting, and therefore
>>> this email :)
>>>
>>> This is actually something that we need to address asap. Like you mention,
>>> we saw earlier that specific transitive dependencies are not compatible
>>> and then we end up with a breaking CI, or even worse, a broken release.
>>> Earlier we had the fixed versions (==) in the setup.py and the requirements
>>> for the CI in a separate requirements.txt. This was also far from
>>> optimal since we had two versions of the requirements.
>>>
>>> I like the idea that you are proposing. Maybe we can do an experiment with
>>> it. Because of the nature of Airflow (orchestrating different systems), we
>>> have a huge list of dependencies. To avoid installing everything, we've created
>>> groups, for example specific libraries for when you're using Google Cloud,
>>> Elastic, Druid, etc. So I'm curious how it will work with the
>>> `extras_require` of Airflow.
>>>
>>> Regarding pipenv: I don't use any pipenv/virtualenv anymore. For me
>>> Docker is much easier to work with. I'm also working on a PR to get rid of
>>> tox for the testing, and move to a more Docker-idiomatic test pipeline.
>>> Curious what your thoughts are on that.
>>>
>>> Cheers, Fokko
>>>
>>> On Thu, 4 Oct 2018 at 15:39, Arthur Wiedmer <arthur.wiedmer@gmail.com> wrote:
>>>
>>>> Thanks Jakob!
>>>>
>>>> I think that this is a huge risk of Slack.
>>>> I am not against Slack as a support channel, but it is a slippery slope to
>>>> have more and more decisions/conversations happening there, contrary to
>>>> what we hope to achieve with the ASF.
>>>>
>>>> When we are starting to discuss issues of development, extensions and
>>>> improvements, it is important for the discussion to happen on the mailing
>>>> list.
>>>>
>>>> Jarek, I wouldn't worry too much, we are still in the process of learning
>>>> as a community. Welcome and thank you for your contribution!
>>>>
>>>> Best,
>>>> Arthur.
>>>>
>>>> On Thu, Oct 4, 2018 at 1:42 PM Jarek Potiuk wrote:
>>>>
>>>>> Thanks for pointing it out Jakob.
>>>>>
>>>>> I am still very fresh in the ASF community and learning the ropes and
>>>>> etiquette and code of conduct. Apologies for my ignorance.
>>>>> I re-read the conduct and FAQ now again - with more understanding - and will
>>>>> pay more attention to wording in the future.
>>>>> As you mentioned, it's more the
>>>>> wording than intentions, but since it was in the TL;DR it has stronger
>>>>> consequences.
>>>>>
>>>>> BTW, thanks for actually following the code of conduct and pointing it out
>>>>> in a respectful manner. I really appreciate it.
>>>>>
>>>>> J.
>>>>>
>>>>> Principal Software Engineer
>>>>> Phone: +48660796129
>>>>>
>>>>> On Thu, 4 Oct 2018, 20:41 Jakob Homan wrote:
>>>>>
>>>>>>> TL;DR; A change is coming in the way how dependencies/requirements are
>>>>>>> specified for Apache Airflow - they will be fixed rather than flexible
>>>>>>> (== rather than >=).
>>>>>>>
>>>>>>> This is follow up after Slack discussion we had with Ash and Kaxil -
>>>>>>> summarising what we propose we'll do.
>>>>>>
>>>>>> Hey all. It's great that we're moving this discussion back from Slack
>>>>>> to the mailing list. But I've gotta point out that the wording needs
>>>>>> a small but critical fix up:
>>>>>>
>>>>>> "A change *is* coming... they *will* be fixed"
>>>>>>
>>>>>> needs to be
>>>>>>
>>>>>> "We'd like to propose a change... We would like to make them fixed."
>>>>>>
>>>>>> The first says that this decision has been made and the result of the
>>>>>> decision, which was made on Slack, is being reported back to the
>>>>>> mailing list. The second is more accurate to the rest of the
>>>>>> discussion ('what we propose...'). And again, since it's axiomatic in
>>>>>> ASF that if it didn't happen on a list, it didn't happen[1], we gotta
>>>>>> make sure there's no confusion about where the community is on the
>>>>>> decision-making process.
>>>>>>
>>>>>> Thanks,
>>>>>> Jakob
>>>>>>
>>>>>> [1] https://community.apache.org/newbiefaq.html#NewbieFAQ-IsthereaCodeofConductforApacheprojects?
>>>>>>
>>>>>> On Thu, Oct 4, 2018 at 9:56 AM Alex Guziel wrote:
>>>>>>>
>>>>>>> You should run `pip check` to ensure no conflicts. Pip does not do this on
>>>>>>> its own.
>>>>>>>
>>>>>>> On Thu, Oct 4, 2018 at 9:20 AM Jarek Potiuk <Jarek.Potiuk@polidea.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Great that this discussion already happened :). Lots of useful things in
>>>>>>>> it. And yes - it means pinning in requirements.txt - this is how pip-tools
>>>>>>>> works.
>>>>>>>>
>>>>>>>> J.
>>>>>>>>
>>>>>>>> Principal Software Engineer
>>>>>>>> Phone: +48660796129
>>>>>>>>
>>>>>>>> On Thu, 4 Oct 2018, 18:14 Arthur Wiedmer, <arthur.wiedmer@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Jarek,
>>>>>>>>>
>>>>>>>>> I will +1 the discussion Dan is referring to and George's advice.
>>>>>>>>>
>>>>>>>>> I just want to double check we are talking about pinning in
>>>>>>>>> requirements.txt only.
>>>>>>>>>
>>>>>>>>> This offers the ability to
>>>>>>>>> pip install -r requirements.txt
>>>>>>>>> pip install --no-deps airflow
>>>>>>>>> for a guaranteed install which works.
>>>>>>>>>
>>>>>>>>> Several different requirement files can be provided for specific use cases,
>>>>>>>>> like a stable dev one, for instance, for people wanting to work on operators
>>>>>>>>> and non-core functions.
>>>>>>>>>
>>>>>>>>> However, I think we should proactively test in CI against unpinned
>>>>>>>>> dependencies (though it might be a separate case in the matrix), so that
>>>>>>>>> we get advance warning if possible that things will break.
>>>>>>>>> CI downtime is not a bad thing here, it actually caught a problem :)
>>>>>>>>>
>>>>>>>>> We should unpin as much as possible in setup.py to only maintain minimum
>>>>>>>>> required compatibility.
>>>>>>>>> The process of pinning in setup.py is extremely detrimental
>>>>>>>>> when you have a large number of python libraries installed with different
>>>>>>>>> pinned versions.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Arthur
>>>>>>>>>
>>>>>>>>> On Thu, Oct 4, 2018 at 8:36 AM Dan Davydov wrote:
>>>>>>>>>
>>>>>>>>>> Relevant discussion about this:
>>>>>>>>>> https://github.com/apache/incubator-airflow/pull/1809#issuecomment-257502174
>>>>>>>>>>
>>>>>>>>>> On Thu, Oct 4, 2018 at 11:25 AM Jarek Potiuk <Jarek.Potiuk@polidea.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> TL;DR; A change is coming in the way how dependencies/requirements are
>>>>>>>>>>> specified for Apache Airflow - they will be fixed rather than flexible
>>>>>>>>>>> (== rather than >=).
>>>>>>>>>>>
>>>>>>>>>>> This is follow up after Slack discussion we had with Ash and Kaxil -
>>>>>>>>>>> summarising what we propose we'll do.
>>>>>>>>>>>
>>>>>>>>>>> *Problem:*
>>>>>>>>>>> During the last few weeks we experienced quite a few downtimes of TravisCI
>>>>>>>>>>> builds (for all PRs/branches including master) as some of the transitive
>>>>>>>>>>> dependencies were automatically upgraded. This is because in a number of
>>>>>>>>>>> dependencies we have >= rather than == dependencies.
>>>>>>>>>>>
>>>>>>>>>>> Whenever there is a new release of such a dependency, it might cause a chain
>>>>>>>>>>> reaction of upgrades of transitive dependencies, which might get into
>>>>>>>>>>> conflict.
>>>>>>>>>>>
>>>>>>>>>>> An example was the Flask-AppBuilder vs flask-login transitive dependency on
>>>>>>>>>>> click. They started to conflict once AppBuilder released version
>>>>>>>>>>> 1.12.0.
>>>>>>>>>>>
>>>>>>>>>>> *Diagnosis:*
>>>>>>>>>>> Transitive dependencies with "flexible" versions (where >= is used instead
>>>>>>>>>>> of ==) are a reason for "dependency hell". We will sooner or later hit other
>>>>>>>>>>> cases where unfixed dependencies cause similar problems with other
>>>>>>>>>>> transitive dependencies. We need to fix-pin them. Flexible versions cause
>>>>>>>>>>> problems both for released versions (because they stop working!) and for
>>>>>>>>>>> development (because they break master builds in TravisCI and prevent people
>>>>>>>>>>> from installing a development environment from scratch).
>>>>>>>>>>>
>>>>>>>>>>> *Solution:*
>>>>>>>>>>>
>>>>>>>>>>> - Following the old-but-good post
>>>>>>>>>>>   https://nvie.com/posts/pin-your-packages/ we are going to fix the pinned
>>>>>>>>>>>   dependencies to specific versions (so basically all dependencies are
>>>>>>>>>>>   "fixed").
>>>>>>>>>>> - We will introduce a mechanism to be able to upgrade dependencies with
>>>>>>>>>>>   pip-tools (https://github.com/jazzband/pip-tools). We might also take a
>>>>>>>>>>>   look at pipenv: https://pipenv.readthedocs.io/en/latest/
>>>>>>>>>>> - People who would like to upgrade some dependencies for their PRs will
>>>>>>>>>>>   still be able to do it - but such upgrades will be in their PR, thus they
>>>>>>>>>>>   will go through TravisCI tests and they will also have to be specified with
>>>>>>>>>>>   pinned fixed versions (==). This should be part of the review process to make
>>>>>>>>>>>   sure new/changed requirements are pinned.
>>>>>>>>>>> - In the release process there will be a point where an upgrade will be
>>>>>>>>>>>   attempted for all requirements (using pip-tools) so that we are not stuck
>>>>>>>>>>>   with older releases. This will be in a controlled PR environment where there
>>>>>>>>>>>   will be time to fix all dependencies without impacting others and likely
>>>>>>>>>>>   enough time to "vet" such changes (this can be done for alpha/beta releases
>>>>>>>>>>>   for example).
>>>>>>>>>>> - As a side effect, the dependency specification will become far simpler
>>>>>>>>>>>   and more straightforward.
>>>>>>>>>>>
>>>>>>>>>>> Happy to hear community comments on the proposal. I am happy to take the
>>>>>>>>>>> lead on that, open a JIRA issue and implement it if this is something the
>>>>>>>>>>> community is happy with.
>>>>>>>>>>>
>>>>>>>>>>> J.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>> *Jarek Potiuk, Principal Software Engineer*
>>>>>>>>>>> Mobile: +48 660 796 129
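
The review step proposed in the thread (making sure new or changed requirements are pinned with == rather than >=) could be sketched roughly as below. This is only an illustration under stated assumptions: requirements live in a pip-style requirements.txt, the function name find_unpinned is hypothetical, and the regular expression covers only simple name[extras]==version lines, not the full pip requirement syntax (URLs and VCS links, for example, would be reported as unpinned).

```python
import re

# Assumption: a requirement counts as "pinned" when it has the form
# name, optional [extras], then exactly one "==<version>" with no wildcard.
_PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]*\])?\s*==\s*[^\s,;*]+$"
)


def find_unpinned(lines):
    """Return the requirement entries that are not pinned to one exact version.

    Hypothetical helper for illustration; not part of pip or pip-tools.
    """
    unpinned = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            # Skip blank lines and options such as "-r other-requirements.txt".
            continue
        requirement = line.split(";", 1)[0].strip()  # drop environment markers
        if not _PINNED.match(requirement):
            unpinned.append(requirement)
    return unpinned


if __name__ == "__main__":
    # Illustrative entries; apart from Flask-AppBuilder 1.12.0, which is
    # mentioned in the thread, the versions here are made up.
    sample = [
        "flask-appbuilder==1.12.0",
        "click>=6.7",
        "requests[security]==2.19.1  # pinned, with extras",
        "-r extra-requirements.txt",
        "six",
    ]
    for entry in find_unpinned(sample):
        print("not pinned:", entry)
```

A check like this only confirms that each entry uses ==. As noted in the thread, it does not replace running `pip check` after installation to detect conflicting transitive dependencies.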