nifi-dev mailing list archives

From Peter Wilcsinszky <peterwilcsins...@gmail.com>
Subject Re: [DISCUSS] Tar + Gzip vs. Zip
Date Mon, 30 Jul 2018 09:46:06 GMT
Hi!

I've created a PR to include the toolkit in the NiFi Docker image and also
to add the changes discussed in this topic:
- use a multistage build to avoid doubling the image size
- use Zip instead of Tar+Gzip

https://issues.apache.org/jira/browse/NIFI-5468
https://github.com/apache/nifi/pull/2921
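
For readers unfamiliar with the approach, here is a minimal sketch of what a
multistage build that consumes the Zip could look like. It is not taken from
the PR above. The base images, the download URL layout, the `unpack` stage
name, and the `/opt/nifi` target path are all assumptions for illustration,
as is the assumption that the archive unpacks to a `nifi-<version>` directory.

```dockerfile
# Stage 1: throwaway image that downloads and unpacks the zip.
# Nothing from this stage ends up in the final image unless copied explicitly.
FROM alpine:3.8 AS unpack
ARG NIFI_VERSION=1.7.0
# Assumed download location; adjust to the mirror actually in use.
ARG BASE_URL=https://archive.apache.org/dist
RUN apk add --no-cache curl unzip \
    && curl -fSL "${BASE_URL}/nifi/${NIFI_VERSION}/nifi-${NIFI_VERSION}-bin.zip" -o /tmp/nifi.zip \
    && unzip -q /tmp/nifi.zip -d /opt \
    && rm /tmp/nifi.zip

# Stage 2: the final image. Only the unpacked directory is copied over,
# so the zip itself never becomes a layer of this image.
FROM openjdk:8-jre
# ARG values do not carry across stages, so the version is declared again.
ARG NIFI_VERSION=1.7.0
COPY --from=unpack /opt/nifi-${NIFI_VERSION} /opt/nifi
```

With a single stage, the downloaded archive and the unpacked files would each
be stored unless the download, extraction, and cleanup all happen in one RUN
layer. Copying from a separate stage keeps only the unpacked directory.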

Cheers,
Peter

On Fri, Jun 29, 2018 at 1:41 PM Peter Wilcsinszky <
peterwilcsinszky@gmail.com> wrote:

> Yes, I mean with this (multistage build) we cannot get rid of the two
> separate modules (maven and dockerhub) but we can get rid of the ADD
> instruction which I think has the benefit of making the build clearer and
> more explicit as well.
>
> On Fri, Jun 29, 2018 at 1:23 PM Aldrin Piri <aldrinpiri@gmail.com> wrote:
>
>> Hi Peter,
>>
>> I remember seeing this but the criteria about working only on Mac and
>> Windows makes it a challenge, in my opinion.
>>
>> I also need to apologize as I certainly confused the Dockerfiles between
>> the Maven plugin and Docker Hub.  My prior email should have been
>> directed toward the Maven scenario, as that is the one using ADD.  Docker
>> Hub will just require updating the curl command to the .zip extension and
>> we should be set.  Regardless, Andy, when you make the issue for this
>> change feel free to create a subtask of that to update the Dockerfiles.
>> Looks like Peter is up to the task but I am also happy to help make the
>> adjustments and verify.  The first linked item you provided is the
>> multistage approach mentioned.  Multistage builds let you create
>> throwaway images and select only specific artifacts from them to use in a
>> new image.
>>
>> Thanks!
>> --aldrin
>>
>> On Fri, Jun 29, 2018 at 7:11 AM Peter Wilcsinszky <
>> peterwilcsinszky@gmail.com> wrote:
>>
>> > Hi,
>> >
>> > I wrote about a different solution, for which I implemented a PoC, in
>> > https://lists.apache.org/thread.html/6122674030b8f99a63d586dcdbdaf6b31841572aed63fcc9dcfb5eea@%3Cdev.nifi.apache.org%3E
>> > but a multistage build could be a better option and I'm happy to create an
>> > issue and fix it for the next release.
>> >
>> > On Fri, Jun 29, 2018 at 3:42 AM Andy LoPresto <alopresto@apache.org>
>> > wrote:
>> >
>> > > Thanks Aldrin. I am not knowledgeable on Docker. Do either of these
>> > > options help us? We could also use a RUN to curl the Zip resource and
>> > > COPY the unzipped directory?
>> > >
>> > > [1] https://github.com/moby/moby/issues/15036#issuecomment-322177465
>> > > [2] https://github.com/jlhawn/dockramp
>> > >
>> > >
>> > > Andy LoPresto
>> > > alopresto@apache.org
>> > > *alopresto.apache@gmail.com <alopresto.apache@gmail.com>*
>> > > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>> > >
>> > > On Jun 28, 2018, at 6:22 PM, Aldrin Piri <aldrinpiri@gmail.com> wrote:
>> > >
>> > > Be mindful to also update the Dockerfile used for Docker Hub as this will
>> > > require some adjustments.  Unfortunately, the ADD instruction does not
>> > > support zip files.  This isn't a major inconvenience but will require a
>> > > multi-stage build to help keep our image size svelte.  I believe we should
>> > > be safe as we have been publishing both tarballs and zips for prior
>> > > releases, so the Dockerfile should still work in that scenario.
>> > >
>> > > On Wed, Jun 27, 2018 at 4:06 PM Andy LoPresto <alopresto@apache.org>
>> > > wrote:
>> > >
>> > > Thanks for everyone’s input. There seems to be a clear consensus to
>> > > eliminate .tar.gz and only provide .zip moving forward. I’d like to keep
>> > > this discussion thread going for another day or two to field any
>> > > objections. After that time (Friday-ish), I’ll create a Jira to do this
>> > > unless things change.
>> > >
>> > > I will probably keep the possibility to generate the .tar.gz through an
>> > > inactive profile to allow people who need that offering to use it. There
>> > > will be a subtask Jira to update the release guide moving forward as well.
>> > >
>> > >
>> > > Andy LoPresto
>> > > alopresto@apache.org
>> > > *alopresto.apache@gmail.com <alopresto.apache@gmail.com>*
>> > > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>> > >
>> > > On Jun 26, 2018, at 7:52 PM, James Wing <jvwing@gmail.com> wrote:
>> > >
>> > > It's a great idea, Andy, I strongly support just one format.  I think Zip
>> > > is a good choice.
>> > >
>> > > On Tue, Jun 26, 2018 at 11:16 AM Otto Fowler <ottobackwards@gmail.com>
>> > > wrote:
>> > >
>> > > I end up using zip all the time.  zip +1
>> > >
>> > >
>> > > On June 26, 2018 at 13:30:33, Tony Kurc (tkurc@apache.org) wrote:
>> > >
>> > > My preference is zip.
>> > >
>> > > On Tue, Jun 26, 2018, 9:21 AM Josh Elser <elserj@apache.org> wrote:
>> > >
>> > >
>> > >
>> > > On 6/25/18 11:34 PM, Andy LoPresto wrote:
>> > >
>> > > Hi folks,
>> > >
>> > > I do not want to start a long-running argument or entrenched battle.
>> > > However, having just performed the RM duties for the latest release, I
>> > > believe I have identified a resource inefficiency in the fact that we
>> > > generate, upload, host, and distribute two compressed archives of the
>> > > binary which are functionally equivalent. For 1.7.0, both the .tar.gz
>> > > and .zip files are 1.2 GB (1_224_352_000 bytes for tar.gz vs.
>> > > 1_224_392_000 bytes for zip). The time to build and sign these is
>> > > substantial, but the true cost comes in uploading and hosting them.
>> > > While the fabled extension registry will save all of us from this
>> > > burden, it isn’t arriving tomorrow, and I think we could drastically
>> > > improve this before the next release.
>> > >
>> > > I have no personal preference between the two formats. In earlier days,
>> > > there were platform inconsistencies and the tools weren’t available on
>> > > all systems, but now they are pretty standard for all users. This [1] is
>> > > an interesting article I found which had some good info on the origins,
>> > > and here are some additional resources for anyone interested [2][3]. I
>> > > don’t care which we pick, but I propose removing one of the options for
>> > > the build going forward (toolkit as well).
>> > >
>> > > That said, if someone has a good reason that both are necessary, I would
>> > > love to hear it. I didn’t find anything on the Apache Release Policy [4]
>> > > which stated we must offer both, but maybe I missed it. Thanks.
>> > >
>> > >
>> > > I'm not aware of any ASF policy. I think it mostly stems from the default
>> > > convention you get out of the maven-assembly-plugin.
>> > >
>> > > [1] https://itsfoss.com/tar-vs-zip-vs-gz/
>> > > [2] https://superuser.com/a/1257441/40003
>> > > [3] https://superuser.com/a/173995/40003
>> > > [4] https://www.apache.org/legal/release-policy.html#artifacts
>> > >
>> > >
>> > > Andy LoPresto
>> > > alopresto@apache.org
>> > > alopresto.apache@gmail.com
>> > > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
