commons-issues mailing list archives

From "BELUGA BEHR (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (COMPRESS-234) Patch: TAR InputStream Huge Speed Improvements
Date Tue, 23 Jul 2013 02:18:48 GMT

    [ https://issues.apache.org/jira/browse/COMPRESS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13715982#comment-13715982 ]

BELUGA BEHR edited comment on COMPRESS-234 at 7/23/13 2:17 AM:
---------------------------------------------------------------

OK, I have submitted what I believe should be the final version: Archiver_Tar.3.patch.

Notes: It still fails the same unit test.  My previous assumption about the cause was wrong;
I now think the unit test itself is at fault.  The test reads an archive that has random
data appended to the end of it.  I believe the archive file also has a lot of zero padding
at the end, which the TarBuffer may have read off because it was configured to read blocks
of 10 * 512-byte chunks at a time.  I believe a new file is required for this test.  Can you
please verify and regenerate?
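
To make the padding argument concrete, here is a rough sketch of the arithmetic, not code from the patch. It assumes 512-byte tar records and a reader that always consumes whole blocks; the class and method names are hypothetical.

```java
/** Hypothetical helper: how far a block-at-a-time reader has advanced. */
class BlockedReadMath {
    /** Size of one tar record in bytes. */
    static final int RECORD_SIZE = 512;

    /**
     * Returns the stream offset reached by a reader that only reads whole
     * blocks of recordsPerBlock records, once it has seen the byte at
     * logicalEnd (the offset just past the last record it actually needs).
     * This is logicalEnd rounded up to the next block boundary.
     */
    static long consumedByBlockedReader(long logicalEnd, int recordsPerBlock) {
        long blockSize = (long) recordsPerBlock * RECORD_SIZE;
        return ((logicalEnd + blockSize - 1) / blockSize) * blockSize;
    }
}
```

Under these assumptions, an archive whose last needed record ends at offset 1536 leaves a 10-record reader positioned at 5120, while a record-at-a-time reader stops at 1536. Whatever lies in between (padding or appended data) is seen by one reader and not the other, which would explain a test that depends on the exact stopping point.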

I have removed internal buffering and pesky tabs.

To answer your question about skip(Long.MAX_VALUE): the skip function only lets you skip
the length of the current archive.  It didn't seem necessary to calculate how far to jump,
make the call, and then have that distance checked again.  Passing in Long.MAX_VALUE just
lets the skip function handle the details in one place.
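
A minimal sketch of that idea follows. It is not the code in Archiver_Tar.3.patch; BoundedEntryStream and its fields are hypothetical names, and it assumes the limit is the number of bytes remaining in the entry currently being read.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical wrapper that never skips past the end of the current entry. */
class BoundedEntryStream {
    private final InputStream in;
    private long remaining; // bytes left in the current entry (assumed limit)

    BoundedEntryStream(InputStream in, long entrySize) {
        this.in = in;
        this.remaining = entrySize;
    }

    /** Skips at most n bytes, clamped to what is left of the entry. */
    long skip(long n) throws IOException {
        if (n <= 0 || remaining == 0) {
            return 0;
        }
        long toSkip = Math.min(n, remaining); // the clamp lives in one place
        long skipped = 0;
        while (skipped < toSkip) {
            long s = in.skip(toSkip - skipped);
            if (s <= 0) {
                break; // underlying stream could not skip further
            }
            skipped += s;
        }
        remaining -= skipped;
        return skipped;
    }
}
```

With a clamp like this, a caller that wants to move to the end of the entry can pass Long.MAX_VALUE and does not need to compute the remaining length itself.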
                
      was (Author: belugabehr):
    OK,  I have submitted what I believe should be the final version.  It is Archiver_Tar.3.patch.

Notes: It still fails that same unit test.  I think the unit test here is at fault.  The test
runs an archive with random data appended onto the end of it.  I believe that the archive
file has a lot of zero padding at the end.  The TarBuffer may have read that off as it was
configured to read blocks of 10 * 512 chunks at one time.  I believe that a new file is required
for this test.  Can you please verify and regenerate?


                  
> Patch: TAR InputStream Huge Speed Improvements
> ----------------------------------------------
>
>                 Key: COMPRESS-234
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-234
>             Project: Commons Compress
>          Issue Type: Improvement
>          Components: Archivers
>            Reporter: BELUGA BEHR
>         Attachments: Archiver_Tar.2.patch, Archiver_Tar.3.patch, Archiver_Tar.patch, TarArchiveInputStream.java.patch, TarBuffer.java.patch
>
>
> I have looked over TarBuffer and TarArchiveInputStream and found some ways to improve
> performance by orders of magnitude.
> I used a 1 GB TAR archive file (no compression).
> Times for reading all entry file names:
> Current - 630ms
> Mine - 17ms
> Times for extracting all entry files:
> Current - 2446ms
> Mine - 2214ms
> As you can see, I have greatly improved the "skip" methods.  Actual extraction time was
> within the margin of error, and the timings bounce around a lot.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
