airflow-dev mailing list archives

From Jarek Potiuk <Jarek.Pot...@polidea.com>
Subject Re: Bring all the "non-official" binaries under Airflow Community control
Date Mon, 22 Jun 2020 13:27:14 GMT
I'd love to see if that's also something that bothers others, not only me -
maybe it's just me being over-cautious :).


Some more context: the whole issue originated from this comment of Aneesh's,
https://github.com/apache/airflow/pull/9371#discussion_r442657586, concerning
the "helm-unittest" image by Aneesh.

@Ash - The only problem I recall with images so far was with some hadolint
image releases (we did not have it pinned before). And I am not too worried
about Astronomer's images. We might also simply agree - as a community - to
use Astronomer's ones as "official" images if Astronomer makes those
"officially" available :) - in which case they might fall into the 1) camp.

But I think it's a good question to ask what others are doing - I am going
to ask on the build@ devlist to see what the approach of other projects,
infra and the ASF in general is. I'd love to hear how other Apache projects
are dealing with it.

J.

On Mon, Jun 22, 2020 at 3:08 PM Ash Berlin-Taylor <ash@apache.org> wrote:

> Licensing-wise there is no issue from me: the astronomerinc images are
> just re-packagings of the upstream images to apply security fixes, so they
> are licensed under whatever the original image is (MIT or Apache2 usually,
> else we wouldn't have put them in the helm chart PR).
>
> For background, the reason that we at Astronomer created
> ap-pgbouncer-exporter in the first place is that the upstream package
> does not patch/rebuild to address security vulnerabilities. By taking
> this into airflow-ext it means we as a project become responsible for
> monitoring and testing that. (And don't be fooled into thinking the
> free scanners can detect all vulns here, we've found them to be of very
> variable and questionable accuracy.)
>
> That is a non-trivial amount of work for an open source project.
>
> Has this ever caused us any problems outside of Pip/python dependencies?
> (I'm not aware of any.) For runtime this maybe makes sense (again, I'm
> not yet convinced), but for test-only/dev-only deps this seems like a
> lot of work that we could better spend on working on Airflow. If we pin
> the versions of the Docker images used then the only real risk is a
> left-pad scenario of "I'm deleting all my images", which is a minor risk.
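
One way to act on the pinning point above is a quick check for image
references that carry no explicit version. This is a minimal Python sketch,
assuming the references are taken exactly as written in the list quoted
further down in this thread; the is_pinned helper is hypothetical and not
part of Airflow.

```python
def is_pinned(image_ref: str) -> bool:
    """Return True if the reference has a digest or an explicit non-'latest' tag.

    Hypothetical helper for illustration only.
    """
    if "@" in image_ref:
        # Digest reference such as name@sha256:...
        return True
    # Only look at the last path segment so that a registry port
    # (e.g. "registry:5000/foo") is not mistaken for a tag.
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False
    tag = last_segment.split(":", 1)[1]
    return tag not in ("", "latest")


# Image references as they appear in the list quoted in this thread.
images = [
    "aneeshkj/helm-unittest",
    "ashb/apache-rat:0.13-1",
    "godatadriven/krb5-kdc-server",
    "polinux/stress",
    "osixia/openldap:1.2.0",
    "astronomerinc/ap-statsd-exporter:0.11.0",
    "astronomerinc/ap-pgbouncer:1.8.1",
    "astronomerinc/ap-pgbouncer-exporter:0.5.0-1",
]

unpinned = [ref for ref in images if not is_pinned(ref)]
for ref in unpinned:
    print("no explicit version:", ref)
```

Note that a tag can still be re-pushed by its owner; only a digest reference
identifies one exact image, so a check like this is a first pass at most.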
>
> Does any other project do anything like this? I haven't seen it before.
>
> I'd vote for doing nothing and addressing this in specific cases when it
> becomes a problem, because I do not see using third-party Docker images
> as a risk. I see it as a time-saving measure.
>
> -ash
>
> On Jun 22 2020, at 1:42 pm, Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
>
> > Hello everyone,
> >
> > TL;DR; I noticed that we are accumulating some dependencies on external
> > binaries (downloads and Docker images) which make the Apache Airflow
> > Community a bit vulnerable to external dependencies.  I would love your
> > comments/opinions on the proposal I made around this.
> >
> > *More explanation/status:*
> >
> > While depending on images officially "released" and "managed" by the
> > owning organizations is fine, I think it is a bit risky to depend on
> > the rest long term, and I think we should aim to bring all those
> > "vulnerable" dependencies into community control.
> >
> > I reviewed all our code (or I think all!) looking for such dependencies
> > and prepared an "umbrella" issue where I proposed the approach we can
> > take for all such dependencies.
> >
> > I could have missed some - so if you find others feel free to comment/add
> > the new ones.
> > All the details are captured here:
> > https://github.com/apache/airflow/issues/9401 - I discussed the
> > context/motivation/current status and approach we can take for those
> > dependencies.
> >
> > A lot of those dependencies just need review and maybe some updates to
> > latest versions. And I do not think there is a lot to discuss for those.
> >
> > There is one point, however, that requires more deliberate action and
> > some decisions I think.
> >
> > We have some dependencies on Docker images that we are using from various
> > sources:
> > 1) officially maintained images
> > 2) images released by organizations that released them for their own
> > purpose, but they are not "officially maintained" by those organizations
> > 3) images released by private individuals
> >
> > While 1) is perfectly OK, I think for 2) and 3) we should bring the
> > images to Airflow community management. Here is the list of those images
> > I found that need to be moved to Airflow:
> >
> >   - aneeshkj/helm-unittest
> >   - ashb/apache-rat:0.13-1
> >   - godatadriven/krb5-kdc-server
> >   - polinux/stress (?)
> >   - osixia/openldap:1.2.0
> >   - astronomerinc/ap-statsd-exporter:0.11.0
> >   - astronomerinc/ap-pgbouncer:1.8.1
> >   - astronomerinc/ap-pgbouncer-exporter:0.5.0-1
> >
> >
> > *Proposal*:
> >
> > My proposal is to make a folder in our repository on GitHub (continuing
> > with the mono-repo approach we follow) to keep the corresponding
> > Dockerfiles and the scripts that build and release images from there.
> > Now the only question is where to keep those images. We currently have
> > apache/airflow but I think we should reserve it for airflow images only
> > and we should keep those images elsewhere. Unfortunately, we cannot have
> > "sub-images" of any sort in DockerHub. We are already abusing the
> > "apache/airflow" namespace a bit as we are keeping both CI and production
> > images there (but that's quite OK as the images are similar).
> >
> > My proposal is to create an *"apache/airflow-ext"* DockerHub
> > repository and keep the images there. It will also be a little
> > abused because we will have to name the images with tags - for example:
> >
> >   - apache/airflow-ext:helm-unittest-[version]
> >   - apache/airflow-ext:apache-rat-[version]
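
The tag naming in the two examples above can be sketched as a small mapping
from an existing image reference to a tag in the proposed repository. This is
a minimal Python sketch under assumptions: the ext_tag helper is hypothetical,
and the "apache/airflow-ext" repository is only a proposal in this thread.

```python
def ext_tag(image_ref: str, repo: str = "apache/airflow-ext") -> str:
    """Map 'owner/name:version' to 'repo:name-version'.

    Hypothetical helper; 'repo' defaults to the repository name
    proposed in this thread, which may not exist.
    """
    # Drop the owner/namespace part, keep "name:version".
    name_and_version = image_ref.rsplit("/", 1)[-1]
    name, sep, version = name_and_version.partition(":")
    if not sep or not version:
        # The scheme encodes the version in the tag, so it needs one.
        raise ValueError(f"{image_ref!r} has no explicit version")
    return f"{repo}:{name}-{version}"


print(ext_tag("ashb/apache-rat:0.13-1"))
print(ext_tag("osixia/openldap:1.2.0"))
```

Because the image name and the version share a single tag, references
without an explicit version (for example aneeshkj/helm-unittest as listed
above) would need a version chosen before they fit this scheme.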
> >
> > I am also open to other names for the repo and to proposals for other
> > ways to handle that.
> >
> > I believe there is no issue with licenses for any of those images (Ash,
> > Kaxil, Fokko - some of the images are Astronomer's/GoDataDriven's ones -
> > can you comment on that?), and that licensing on all those images is OK
> > for us to copy with attribution (I will double-check that for the other
> > images).
> >
> > WDYT?
> >
> > J.
> >
> >
> >
> > --
> >
> > Jarek Potiuk
> > Polidea <https://www.polidea.com/> | Principal Software Engineer
> >
> > M: +48 660 796 129 <+48660796129>
> >
>


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
