commons-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (COMPRESS-375) Allow the clients of ParallelScatterZipCreator to provide ZipArchiveEntryRequestSupplier
Date Sat, 03 Dec 2016 17:38:59 GMT


ASF GitHub Bot commented on COMPRESS-375:

GitHub user plamentotev opened a pull request:

    COMPRESS-375 Allow the deferred creation of `ZipArchiveEntry` for parallel zips

    This is my attempt to implement the changes suggested in [COMPRESS-375](

You can merge this pull request into a Git repository by running:

    $ git pull COMPRESS-375-ZipArchiveEntryRequestSupplier

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #12

commit 60fe42d6fc19e9e4dd4e43aafeedcfdc33915049
Author: Plamen Totev <>
Date:   2016-12-03T17:17:09Z

    COMPRESS-375 Allow the deferred creation of `ZipArchiveEntry` for parallel zips

    In some cases, when creating a parallel zip archive, the `ZipArchiveEntry`
    to be added cannot be created before the `InputStream` is read.
    In those cases there is no point in passing a `ZipArchiveEntry` and an
    `InputStreamSupplier`, as you can't actually defer the creation
    of the `InputStream`: it's needed to build the `ZipArchiveEntry`.
    Add `ZipArchiveEntryRequestSupplier` to allow the deferred
    creation of both `ZipArchiveEntry` and `InputStream`.


> Allow the clients of ParallelScatterZipCreator to provide ZipArchiveEntryRequestSupplier
> ----------------------------------------------------------------------------------------
>                 Key: COMPRESS-375
>                 URL:
>             Project: Commons Compress
>          Issue Type: Improvement
>          Components: Archivers
>            Reporter: Plamen Totev
> Currently, clients of {{ParallelScatterZipCreator}} can provide a {{ZipArchiveEntry}}
and an {{InputStreamSupplier}} through {{ParallelScatterZipCreator#addArchiveEntry}}. From those
two, a {{ZipArchiveEntryRequest}} is created. Providing an {{InputStreamSupplier}} solves the
problem of opening too many files: streams are opened just in time, when an entry is
compressed, not when it is submitted.
> But there are use cases where the stream itself contains information needed to build the {{ZipArchiveEntry}}.
In those cases, creating the {{ZipArchiveEntry}} before the {{InputStream}} is opened won't work.
Having an option to supply both the {{ZipArchiveEntry}} and the {{InputStreamSupplier}} together
(as a {{ZipArchiveEntryRequest}}) at compression time would solve the issue.
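
The just-in-time behaviour described above can be sketched without Commons Compress at all. This is only an illustration under stated assumptions, not the library's code: `DeferredOpenDemo`, `sourceFor`, `submit` and `process` are hypothetical names, and a plain `java.util.function.Supplier<InputStream>` stands in for `InputStreamSupplier`. It shows that queuing suppliers opens nothing, and that a stream is opened only when its entry is actually processed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class DeferredOpenDemo {
    /** Counts streams opened so far; stands in for open file handles. */
    static final AtomicInteger opened = new AtomicInteger();

    /** Hypothetical stand-in for a file-backed source: opens only when get() is called. */
    static Supplier<InputStream> sourceFor(String content) {
        return () -> {
            opened.incrementAndGet();
            return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
        };
    }

    /** Queues one supplier per entry without opening any stream. */
    static List<Supplier<InputStream>> submit(String... contents) {
        List<Supplier<InputStream>> queue = new ArrayList<>();
        for (String c : contents) {
            queue.add(sourceFor(c));
        }
        return queue;
    }

    /** Opens, drains and closes one queued stream; returns the number of bytes read. */
    static int process(Supplier<InputStream> source) {
        try (InputStream in = source.get()) {
            byte[] buf = new byte[4096];
            int total = 0;
            for (int n; (n = in.read(buf)) != -1; ) {
                total += n;
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<Supplier<InputStream>> queue = submit("a", "bc", "def");
        System.out.println("opened after submit: " + opened.get());
        for (Supplier<InputStream> s : queue) {
            process(s);
        }
        System.out.println("opened after processing: " + opened.get());
    }
}
```

The same deferral applied to the entry metadata, not only to the stream, is what the proposed supplier would add.
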
> There is a bug in Plexus Archiver (
that is an example of such a use case. Plexus Archiver has an option that allows entries that are
already zip files to be stored instead of compressed ({{AbstractZipArchiver.recompressAddedZips}}).
To detect whether a given entry is a zip archive, {{AbstractZipArchiver}} has to read the first several
bytes of the stream. So creating the {{ZipArchiveEntry}} before the stream is opened is not useful:
the compression method is not yet known. Opening the stream when the {{ZipArchiveEntry}} is created
won't work either: because you can add entries to {{ParallelScatterZipCreator}} much faster
than you can compress them, you could quickly have too many files open. And I don't think opening
and closing the stream just to inspect it is an option, as such operations can be relatively expensive in the
general case. But if the client could supply both the {{ZipArchiveEntry}} and the {{InputStream}}
just in time (by passing a {{ZipArchiveEntryRequestSupplier}} to {{ParallelScatterZipCreator}}),
the problem would be solved.
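
To make the "read the first several bytes" step concrete, here is a minimal sketch using only the JDK. It is not Plexus Archiver's actual detection logic, which this message does not show: `ZipSignatureSniffer`, `peekable`, `startsWithZipSignature` and `methodFor` are hypothetical names, and the check covers only the common local file header signature `PK\003\004` (an empty zip archive starts differently). The bytes are pushed back after inspection, so the same stream can still be handed on for compression without being reopened.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.ZipEntry;

public class ZipSignatureSniffer {
    /** Local file header signature that begins a typical non-empty zip file. */
    private static final byte[] LOCAL_FILE_HEADER = {0x50, 0x4B, 0x03, 0x04};

    /** Wraps a stream so the signature bytes can be read and then pushed back. */
    static PushbackInputStream peekable(InputStream in) {
        return new PushbackInputStream(in, LOCAL_FILE_HEADER.length);
    }

    /** Peeks at the head of the stream; leaves the stream positioned where it was. */
    static boolean startsWithZipSignature(PushbackInputStream in) {
        try {
            byte[] head = new byte[LOCAL_FILE_HEADER.length];
            int n = 0;
            while (n < head.length) {
                int r = in.read(head, n, head.length - n);
                if (r < 0) {
                    break; // stream shorter than the signature
                }
                n += r;
            }
            if (n > 0) {
                in.unread(head, 0, n); // restore the bytes for the real consumer
            }
            return n == head.length && Arrays.equals(head, LOCAL_FILE_HEADER);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Chooses STORED for content that already looks like a zip, DEFLATED otherwise. */
    static int methodFor(PushbackInputStream in) {
        return startsWithZipSignature(in) ? ZipEntry.STORED : ZipEntry.DEFLATED;
    }

    public static void main(String[] args) {
        byte[] zipLike = {0x50, 0x4B, 0x03, 0x04, 0x14, 0x00};
        byte[] text = "hello".getBytes();
        System.out.println(methodFor(peekable(new ByteArrayInputStream(zipLike))) == ZipEntry.STORED);
        System.out.println(methodFor(peekable(new ByteArrayInputStream(text))) == ZipEntry.DEFLATED);
    }
}
```

Because the method is only known after this peek, the entry and the stream have to be produced together, at the moment the entry is compressed.
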
> What do you think? Does the addition of {{ParallelScatterZipCreator#addArchiveEntry(ZipArchiveEntryRequestSupplier)}}
make sense?

This message was sent by Atlassian JIRA
