commons-issues mailing list archives

From "Dominique De Munck (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COMPRESS-381) performance issue when using default Wiki/docs bzip2 compression Factory methods
Date Wed, 25 Jan 2017 17:52:26 GMT

    [ https://issues.apache.org/jira/browse/COMPRESS-381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15838236#comment-15838236
] 

Dominique De Munck commented on COMPRESS-381:
---------------------------------------------

Thanks for the caveat. Files.readAllBytes is not what makes it faster: I used to have
a small while loop there instead.

I first tried to work by analogy with the "decompress" example on the Examples page, but somehow
couldn't get it to work. That decompress code is also fast (about 200 ms).
So maybe someone can add an analogous example for the compressor?

decompress snippet from Examples page:

FileInputStream fin = new FileInputStream("archive.tar.bz2");
BufferedInputStream in = new BufferedInputStream(fin);
FileOutputStream out = new FileOutputStream("archive.tar");
BZip2CompressorInputStream bzIn = new BZip2CompressorInputStream(in);
final byte[] buffer = new byte[buffersize];
int n = 0;
while (-1 != (n = bzIn.read(buffer))) {
    out.write(buffer, 0, n);
}
out.close();
bzIn.close();
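
A compress-side analogue of the snippet above might look like the sketch below. It is only an illustration, not the project's documented example. To keep it runnable with the JDK alone, java.util.zip.GZIPOutputStream stands in for BZip2CompressorOutputStream. The class name, the method names and the 8192-byte buffer size are made up for this sketch.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class BufferedCompressSketch {
    // Hypothetical value; the Examples-page snippet leaves "buffersize" undefined.
    static final int BUFFER_SIZE = 8192;

    /** Copies everything from in to out in BUFFER_SIZE chunks and returns the byte count. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int n;
        while (-1 != (n = in.read(buffer))) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    /** Compresses rawIn into rawOut. GZIPOutputStream is a JDK stand-in for the bzip2 stream. */
    static void compress(InputStream rawIn, OutputStream rawOut) throws IOException {
        try (InputStream in = new BufferedInputStream(rawIn);
             OutputStream cOut = new GZIPOutputStream(new BufferedOutputStream(rawOut))) {
            copy(in, cOut);
        } // closing cOut finishes the compressed stream and flushes the buffered output
    }

    /** Mirror image of compress, following the decompress snippet from the Examples page. */
    static void decompress(InputStream rawIn, OutputStream rawOut) throws IOException {
        try (InputStream cIn = new GZIPInputStream(new BufferedInputStream(rawIn));
             OutputStream out = new BufferedOutputStream(rawOut)) {
            copy(cIn, out);
        }
    }

    /** Compresses and decompresses data in memory and reports whether the bytes survive. */
    static boolean roundTrips(byte[] data) {
        try {
            ByteArrayOutputStream packed = new ByteArrayOutputStream();
            compress(new ByteArrayInputStream(data), packed);
            ByteArrayOutputStream unpacked = new ByteArrayOutputStream();
            decompress(new ByteArrayInputStream(packed.toByteArray()), unpacked);
            return Arrays.equals(data, unpacked.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] sample = new byte[2 * 1024 * 1024]; // roughly the 2 MB size from the report
        System.out.println(roundTrips(sample));
    }
}
```

With Commons Compress on the classpath, replacing new GZIPOutputStream(...) with new BZip2CompressorOutputStream(...) (and FileInputStream/FileOutputStream for the raw streams) should give the bzip2 variant. I have not timed that variant here.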

> performance issue when using default Wiki/docs bzip2 compression Factory methods
> --------------------------------------------------------------------------------
>
>                 Key: COMPRESS-381
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-381
>             Project: Commons Compress
>          Issue Type: Improvement
>          Components: Documentation
>    Affects Versions: 1.13
>         Environment: Windows/All
>            Reporter: Dominique De Munck
>            Priority: Minor
>              Labels: documentation, easyfix, performance
>
> Hello
> We are going to use this project's bzip2 implementation as it performed best for our
> use case (tested using https://github.com/ning/jvm-compressor-benchmark).
> However, when following the default examples on the wiki/example/javadoc pages (*),
> we hit a serious performance bottleneck.
> The reason: the suggested default "compress" operation on a file is very slow,
> maybe because of disk I/O and lack of caching.
> For a 2 MB tiff file, bzip2 compression takes about 3 seconds with code (A), whereas
> code (B) takes only about 0.5 seconds!
> So it would be good to adapt the documentation or take a look at the bottleneck.
> Kind regards
> Dominique
> A:
> >>>
> FileInputStream fin = new FileInputStream(infile);
> BufferedInputStream bufferin = new BufferedInputStream(fin);
> final FileOutputStream outStream = new FileOutputStream(outfile);
> CompressorOutputStream cos = new CompressorStreamFactory()
>         .createCompressorOutputStream(CompressorStreamFactory.BZIP2, outStream);
> IOUtils.copy(fin, cos);
> cos.close();
> >>>
> B:
> <<<<<
> final byte[] uncompressed = Files.readAllBytes(infile.toPath());
> ByteArrayOutputStream rawOut = new ByteArrayOutputStream(uncompressed.length);
> 		
> BZip2CompressorOutputStream out = new BZip2CompressorOutputStream(rawOut, COMPRESSION_LEVEL);
> out.write(uncompressed);
> out.close();
> FileOutputStream fos = new FileOutputStream(outfile);
> rawOut.writeTo(fos);
> fos.close();
> >>>>
> (*)
> Pages with documentation:
> https://wiki.apache.org/commons/Compress
> https://commons.apache.org/proper/commons-compress/examples.html
> https://commons.apache.org/proper/commons-compress/javadocs/api-release/index.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
