commons-issues mailing list archives

From "Sebb (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COMPRESS-234) Patch: TAR InputStream Huge Speed Improvements
Date Sun, 21 Jul 2013 10:52:49 GMT

    [ https://issues.apache.org/jira/browse/COMPRESS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13714682#comment-13714682 ]

Sebb commented on COMPRESS-234:
-------------------------------

If we do decide that a BufferedInputStream is needed, we could first check whether one has
already been provided, and only wrap the stream in a buffer if required. There's code in
Commons IO (2.5; the IOUtils asBufferedxxx methods) that could be borrowed:

{code}
public static BufferedInputStream asBufferedInputStream(final InputStream inputStream) {
    // reject null early on rather than waiting for IO operation to fail
    if (inputStream == null) { // not checked by BufferedInputStream
        throw new NullPointerException();
    }
    return inputStream instanceof BufferedInputStream
            ? (BufferedInputStream) inputStream
            : new BufferedInputStream(inputStream);
}
{code}
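
As a rough illustration only (this is not part of the attached patches), a caller could use such a helper as sketched below. The class name BufferedWrapDemo and the in-memory test stream are made up for the example; the helper body mirrors the snippet above.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.InputStream;

// Hypothetical demo class; the name is an assumption for this example.
public class BufferedWrapDemo {

    // Same logic as the helper quoted above: wrap only when needed.
    static BufferedInputStream asBufferedInputStream(final InputStream inputStream) {
        // reject null early rather than waiting for an IO operation to fail
        if (inputStream == null) {
            throw new NullPointerException();
        }
        return inputStream instanceof BufferedInputStream
                ? (BufferedInputStream) inputStream
                : new BufferedInputStream(inputStream);
    }

    public static void main(String[] args) {
        // An unbuffered source stands in for whatever the caller passes in.
        InputStream raw = new ByteArrayInputStream(new byte[512]);

        // First call wraps the raw stream in a new buffer.
        BufferedInputStream wrapped = asBufferedInputStream(raw);

        // Second call sees an existing buffer and returns it unchanged,
        // so no second layer of buffering is added.
        BufferedInputStream reused = asBufferedInputStream(wrapped);

        System.out.println(wrapped != raw);    // prints "true"
        System.out.println(reused == wrapped); // prints "true"
    }
}
```

The point of the instanceof check is that callers who already supply a buffered stream do not pay for double buffering, while callers who pass a raw stream still get one.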
                
> Patch: TAR InputStream Huge Speed Improvements
> ----------------------------------------------
>
>                 Key: COMPRESS-234
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-234
>             Project: Commons Compress
>          Issue Type: Improvement
>          Components: Archivers
>            Reporter: BELUGA BEHR
>         Attachments: Archiver_Tar.patch, TarArchiveInputStream.java.patch, TarBuffer.java.patch
>
>
> I have looked over TarBuffer and TarArchiveInputStream and found some ways to improve
> performance by orders of magnitude.
> I used a 1 GB TAR archive file (no compression).
> Times for reading all entry file names:
> Current - 630ms
> Mine - 17ms
> Times for extracting all entry files:
> Current - 2446ms
> Mine - 2214ms
> As you can see, I have greatly enhanced the "skip" methods. Actual extraction times were
> within the margin of error, and the timings bounce around a lot.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
