airflow-dev mailing list archives

From Jarek Potiuk <Jarek.Pot...@polidea.com>
Subject Re: Bring all the "non-official" binaries under Airflow Community control
Date Mon, 22 Jun 2020 14:00:41 GMT
I created a thread at builds@apache.org:

https://lists.apache.org/thread.html/rf2af2a95e7687fe94ede23fe9df388f784c8231a5968b109f677cbe8%40%3Cbuilds.apache.org%3E

Let's see what other projects/infra say about this.

J.


On Mon, Jun 22, 2020 at 3:27 PM Jarek Potiuk <Jarek.Potiuk@polidea.com>
wrote:

> I'd love to see if that's also something that bothers others, not only me
> - maybe it's just me being over-cautious :).
>
>
> Some more context: the whole issue originated from this comment by Aneesh
> https://github.com/apache/airflow/pull/9371#discussion_r442657586 about
> the "helm-unittest" image by Aneesh.
>
> @Ash - The only problem I recall with images so far was some hadolint
> image releases (we did not have it pinned before). And I am not too
> worried about Astronomer's images. We might also simply agree - as
> community - to use Astronomer's ones as "official" images if
> Astronomer makes those "officially" available :) - in which case they might
> fall into the 1) camp.
>
> But I think it's a good question to ask what others are doing - I am going
> to ask on the builds@ list to see what other projects/infra are doing and
> what the general ASF approach to this is. I'd love to hear how other Apache
> projects are dealing with it.
>
> J.
>
> On Mon, Jun 22, 2020 at 3:08 PM Ash Berlin-Taylor <ash@apache.org> wrote:
>
>> Licensing-wise there is no issue from me: the astronomerinc images are
>> just a re-packaging of the upstream images to apply security fixes, so they
>> are licensed under whatever the original image is (MIT or Apache2 usually,
>> else we wouldn't have put them in the helm chart PR).
>>
>> For background, the reason that we at Astronomer created
>> ap-pgbouncer-exporter in the first place is that the upstream package
>> does not patch/rebuild to address security vulnerabilities. By taking
>> this into airflow-ext it means we as a project become responsible for
>> monitoring and testing that. (And don't be fooled into thinking the
>> free scanners can detect all vulns here; we've found them to be of very
>> variable, and questionable, accuracy.)
>>
>> That is a non-trivial amount of work for an open source project.
>>
>> Has this ever caused us any problems outside of Pip/python dependencies?
>> (I'm not aware of any.) For runtime this maybe makes sense (again, I'm
>> not yet convinced), but for test-only/dev-only deps this seems like a
>> lot of work that we could better spend on working on Airflow. If we pin
>> the versions of the Docker images used then the only real risk is a
>> left-pad scenario of "I'm deleting all my images", which is a minor risk.
>>
>> Do any other projects do anything like this? I haven't seen it before.
>>
>> I'd vote for doing nothing and addressing this in specific cases when it
>> becomes a problem, because I do not see using third-party Docker images
>> as a risk. I see it as a time-saving measure.
>>
>> -ash
>>
>> On Jun 22 2020, at 1:42 pm, Jarek Potiuk <Jarek.Potiuk@polidea.com>
>> wrote:
>>
>> > Hello everyone,
>> >
>> > TL;DR; I noticed that we are accumulating some dependencies on external
>> > binaries (downloads and Docker images) which make the Apache Airflow
>> > Community a bit vulnerable to external dependencies. I would love your
>> > comments/opinions on the proposal I made around this.
>> >
>> > *More explanation/status:*
>> >
>> > While depending on binaries that are officially "released" and "managed"
>> > by the owning organizations is fine, I think it is a bit risky to depend
>> > on the other ones long term, and I think we should aim to bring all those
>> > "vulnerable" dependencies into community control.
>> >
>> > I reviewed all our code (or I think all!) looking for such dependencies
>> > and prepared an "umbrella" issue where I proposed the approach we can
>> > take for all such dependencies.
>> >
>> > I could have missed some - so if you find others feel free to
>> > comment/add the new ones.
>> > All the details are captured here:
>> > https://github.com/apache/airflow/issues/9401 - I discussed the
>> > context/motivation/current status and approach we can take for those
>> > dependencies.
>> >
>> > A lot of those dependencies just need review and maybe some updates to
>> > latest versions. And I do not think there is a lot to discuss for those.
>> >
>> > There is one point, however, that requires more deliberate action and
>> > some decisions, I think.
>> >
>> > We have some dependencies on Docker images that we are using from
>> > various sources:
>> > 1) officially maintained images
>> > 2) images released by organizations that released them for their own
>> > purpose, but they are not "officially maintained" by those organizations
>> > 3) images released by private individuals
>> >
>> > While 1) is perfectly OK, I think for 2) and 3) we should bring the
>> > images to Airflow community management. Here is the list of those
>> > images I found that need to be moved to Airflow:
>> >
>> >   - aneeshkj/helm-unittest
>> >   - ashb/apache-rat:0.13-1
>> >   - godatadriven/krb5-kdc-server
>> >   - polinux/stress (?)
>> >   - osixia/openldap:1.2.0
>> >   - astronomerinc/ap-statsd-exporter:0.11.0
>> >   - astronomerinc/ap-pgbouncer:1.8.1
>> >   - astronomerinc/ap-pgbouncer-exporter:0.5.0-1
>> >
>> >
>> > *Proposal*:
>> >
>> > My proposal is to make a folder in our repository on GitHub (continuing
>> > with the mono-repo approach we follow) to keep the corresponding
>> > Dockerfiles and scripts that build and release images from there. Now
>> > the only question is where to keep those images. We currently have
>> > apache/airflow but I think we should reserve it for Airflow images only
>> > and we should keep those images elsewhere. Unfortunately, we cannot have
>> > "sub-images" of any sort in DockerHub. We are already abusing the
>> > "apache/airflow" namespace a bit, as we are keeping both CI and
>> > production images there (but that's quite OK as the images are similar).
>> >
>> > My proposal is to create an *"apache/airflow-ext"* DockerHub
>> > repository and keep the images there. That repository will also be a
>> > little abused because we will have to name the images with tags - for
>> > example:
>> >
>> >   - apache/airflow-ext:helm-unittest-[version]
>> >   - apache/airflow-ext:apache-rat-[version]
>> >
>> > I am also open to other names for the repo and to proposals for other
>> > ways to handle that.
>> >
>> > I believe there is no issue with licences for any of those images (Ash,
>> > Kaxil, Fokko - some of the images are Astronomer's/GoDataDriven's ones -
>> > can you comment on that?) and I believe the licensing on all those
>> > images is OK for us to copy with attribution (I will double-check that
>> > for other images).
>> >
>> > WDYT?
>> >
>> > J.
>> >
>> >
>> >
>> > --
>> >
>> > Jarek Potiuk
>> > Polidea <https://www.polidea.com/> | Principal Software Engineer
>> >
>> > M: +48 660 796 129 <+48660796129>
>> > [image: Polidea] <https://www.polidea.com/>
>> >
>>
>
>
>

