commons-issues mailing list archives

From "Stefan Bodewig (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COMPRESS-381) performance issue when using default Wiki/docs bzip2 compression Factory methods
Date Wed, 25 Jan 2017 16:52:26 GMT

    [ https://issues.apache.org/jira/browse/COMPRESS-381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15838106#comment-15838106
] 

Stefan Bodewig commented on COMPRESS-381:
-----------------------------------------

We can certainly add it as an option.

But your solution B) has a few problems. First of all it requires you to keep the whole file
in memory, which may be fine for 2 MB but not in the general case. Then Commons Compress targets
Java 7, which doesn't contain {{Files.readAllBytes}}. We do have our own {{IOUtils.toByteArray}}
but I'm not sure it is faster than using a {{BufferedInputStream}}.
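
For illustration only, here is a minimal sketch of the streaming alternative mentioned above: the input is read through a {{BufferedInputStream}} and copied in fixed-size chunks, so memory use does not grow with the file size. It is not taken from Commons Compress or from the issue. The class name, the method names, and the 8 KB buffer size are assumptions, and {{java.util.zip.GZIPOutputStream}} stands in for the compressor so the sketch runs with the JDK alone. With Commons Compress on the classpath, {{new BZip2CompressorOutputStream(fileOut)}} would take its place.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of Commons Compress.
public class StreamingCompressSketch {

    // Copies in fixed-size chunks and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // assumed chunk size
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    // Compresses infile into outfile without loading the whole file into memory.
    static long compress(File infile, File outfile) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(infile));
             OutputStream fileOut = new BufferedOutputStream(new FileOutputStream(outfile));
             // Stand-in compressor (assumption); swap in BZip2CompressorOutputStream for bzip2.
             OutputStream cos = new GZIPOutputStream(fileOut)) {
            return copy(in, cos); // read from the buffered stream, not the raw FileInputStream
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: StreamingCompressSketch <infile> <outfile>");
            return;
        }
        long bytes = compress(new File(args[0]), new File(args[1]));
        System.out.println("compressed " + bytes + " input bytes");
    }
}
```

Whether this is actually faster than solution B) would have to be measured. The sketch only shows that buffering and constant memory use do not exclude each other.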

> performance issue when using default Wiki/docs bzip2 compression Factory methods
> --------------------------------------------------------------------------------
>
>                 Key: COMPRESS-381
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-381
>             Project: Commons Compress
>          Issue Type: Improvement
>          Components: Documentation
>    Affects Versions: 1.13
>         Environment: Windows/All
>            Reporter: Dominique De Munck
>            Priority: Minor
>              Labels: documentation, easyfix, performance
>
> Hello
> We are going to use this project's bzip2 implementation as it performed best for our
> use case (tested using https://github.com/ning/jvm-compressor-benchmark).
> However, when following the default examples using the wiki/example/javadoc pages (*),
> we were hitting a serious performance bottleneck.
> The reason: the suggested default "compress" operation on a file is very slow,
> maybe because of disk I/O and lack of caching.
> For a 2 MB tiff file, bzip2 compression takes about 3 seconds with code (A), whereas
> code (B) takes only about 0.5 seconds!
> So it would be good to adapt the documentation or take a look at the bottleneck.
> Kind regards
> Dominique
> A:
> <<<<<
> FileInputStream fin = new FileInputStream(infile);
> BufferedInputStream bufferin = new BufferedInputStream(fin);
> final FileOutputStream outStream = new FileOutputStream(outfile);
> CompressorOutputStream cos = new CompressorStreamFactory()
>         .createCompressorOutputStream(CompressorStreamFactory.BZIP2, outStream);
> IOUtils.copy(fin, cos);
> cos.close();
> >>>
> B:
> <<<<<
> final byte[] uncompressed = Files.readAllBytes(infile.toPath());
> ByteArrayOutputStream rawOut = new ByteArrayOutputStream(uncompressed.length);
> 		
> BZip2CompressorOutputStream out = new BZip2CompressorOutputStream(rawOut, COMPRESSION_LEVEL);
> out.write(uncompressed);
> out.close();
> FileOutputStream fos = new FileOutputStream(outfile);
> rawOut.writeTo(fos);
> fos.close();
> >>>>
> (*)
> Pages with documentation:
> https://wiki.apache.org/commons/Compress
> https://commons.apache.org/proper/commons-compress/examples.html
> https://commons.apache.org/proper/commons-compress/javadocs/api-release/index.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
