commons-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COMPRESS-388) Improve concurrent reads from ZipFile
Date Mon, 24 Apr 2017 04:22:04 GMT

    [ https://issues.apache.org/jira/browse/COMPRESS-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980711#comment-15980711 ]

ASF GitHub Bot commented on COMPRESS-388:
-----------------------------------------

GitHub user kvr000 commented on the issue:

    https://github.com/apache/commons-compress/pull/21
  
    Additionally, I was thinking about exposing each entry's raw stream starting offset and length
via the public API, so that when needed one can either map the data into memory or access the raw
data directly (especially useful when the zip is just a kind of flat storage, which is quite popular
in games but not only there). For me it would help to implement off-heap read-only storage using a
standard file format that is widely supported by a lot of tools.
    
    It's quite zip-specific (although it can be applied to similar containers too), but the API
already has a lot of zip-specific stuff anyway... That piece of information would only have to be
moved from ZipFile.Entry to ZipFileEntry. What do you think about it?
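
To make the idea concrete, here is a minimal sketch of what a caller could do once the offset and length are available. It assumes the two values come from some accessor on the entry that does not exist in the public API today; `RawEntryMapper`, `mapRegion`, `dataOffset` and `length` are hypothetical names used only for illustration. Mapping the raw bytes is only directly useful for entries stored without compression.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public final class RawEntryMapper {
    private RawEntryMapper() {}

    /**
     * Maps the byte range [dataOffset, dataOffset + length) of the archive read-only.
     * ASSUMPTION: dataOffset and length would be supplied by a hypothetical
     * accessor on the zip entry; this sketch does not call any such API.
     * A single mapping is limited to Integer.MAX_VALUE bytes.
     */
    public static MappedByteBuffer mapRegion(Path archive, long dataOffset, long length)
            throws IOException {
        try (FileChannel channel = FileChannel.open(archive, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, dataOffset, length);
        }
    }
}
```

The returned buffer lives outside the Java heap, which is what makes this attractive for the off-heap read-only storage case.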


> Improve concurrent reads from ZipFile
> -------------------------------------
>
>                 Key: COMPRESS-388
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-388
>             Project: Commons Compress
>          Issue Type: Improvement
>          Components: Archivers
>    Affects Versions: 1.13
>         Environment: Any
>            Reporter: Zbynek Vyskovsky
>              Labels: patch, performance
>             Fix For: 1.14
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Concurrent reads on the ZipFile archive are terribly slow on multiprocessor systems. On
> my 4-CPU laptop it shows 26 reads/s vs 2 reads/s on 100MB samples, for example.
> The cause is the use of synchronized blocks to access the underlying file channel. This
> may be required for a generic SeekableByteChannel, but most commonly the implementation is
> a FileChannel, which supports lock-free reading from any position (i.e. using the
> pread/pwrite system calls or their equivalent).
> With the fix, reads are about 10 times faster (on a 4-CPU system; with more processors
> the difference should grow significantly).
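
The difference between the two access patterns described in the issue can be sketched as follows. This is an illustration under stated assumptions, not the code from the pull request; `PositionalReads`, `readLocked` and `readAt` are hypothetical names. The only library calls used are the standard `SeekableByteChannel.position(long)`, `read(ByteBuffer)` and `FileChannel.read(ByteBuffer, long)`.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.SeekableByteChannel;

public final class PositionalReads {
    private PositionalReads() {}

    /**
     * Generic path: position() followed by read() must happen as one unit,
     * so concurrent readers serialize on the shared channel.
     */
    public static int readLocked(SeekableByteChannel channel, long position, ByteBuffer dst)
            throws IOException {
        synchronized (channel) {
            channel.position(position);
            return channel.read(dst);
        }
    }

    /**
     * FileChannel path: the positional read does not touch the channel's own
     * position, so no lock is taken. Other channel types fall back to the
     * locked variant.
     */
    public static int readAt(SeekableByteChannel channel, long position, ByteBuffer dst)
            throws IOException {
        if (channel instanceof FileChannel) {
            return ((FileChannel) channel).read(dst, position);
        }
        return readLocked(channel, position, dst);
    }
}
```

With `readAt`, several threads can read different entries through the same `FileChannel` without waiting on each other, which is the effect the issue attributes to the fix.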



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
