commons-issues mailing list archives

From "Gary Lucas (JIRA)" <>
Subject [jira] [Commented] (IMAGING-126) TIFF and PNG images should not be bigger than the ones created by java ImageIO
Date Tue, 25 Feb 2014 13:38:19 GMT


Gary Lucas commented on IMAGING-126:

Thanks for the code sample.

It's been over a year since I've worked on Apache Commons Imaging and it's taking me some
time to get set up again.  I've found the problem for TIFF files and am looking into a solution.
 I imagine that the PNG files have the same problem.

First off, I note that your image is 2550-by-3300 pixels in size, or about 8.4 million pixels.
At 3 bytes per pixel, the uncompressed image would be about 24 megabytes.  Your output is
about 1 megabyte, which is substantially smaller than 24 MB, but still bigger than it ought
to be. The good news is that this result indicates that some compression is happening.  It's
just not as efficient as it should be.
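
The size estimate above can be reproduced with a minimal Java sketch. The class and method names are made up for illustration and are not part of Commons Imaging; 3 bytes per pixel is the figure used in the comment.

```java
final class ImageSizeEstimate {
    /** Uncompressed raster size in bytes; long arithmetic avoids int overflow. */
    static long uncompressedBytes(int width, int height, int bytesPerPixel) {
        return (long) width * height * bytesPerPixel;
    }

    public static void main(String[] args) {
        int width = 2550;
        int height = 3300;
        long pixels = (long) width * height;               // 8,415,000 pixels
        long bytes = uncompressedBytes(width, height, 3);  // 25,245,000 bytes
        double mib = bytes / (1024.0 * 1024.0);            // roughly 24 when counted in units of 2^20 bytes
        System.out.println("pixels=" + pixels + " bytes=" + bytes + " MiB=" + mib);
    }
}
```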

In the class, there is a calculation on line 308 where the writeImage
method computes "rowsPerStrip".  In effect, this is the number of rows from the source image
that the image writer will try to compress (each "strip" is compressed independently). The
code looks like this:

     int rowsPerStrip = 64000 / (width * bitsPerSample * samplesPerPixel);

I think that 64000 is an arbitrarily selected number, but it is suspiciously close to 64 K.
  I am not sure what the original author's intention was on this one (he was not fond of adding
comments to his code).  I'll have to look at the TIFF specification to see if there is a reason
for this value.
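
As a quick check of the expression above, here is a minimal Java sketch that evaluates it for a couple of widths. The class and method names are illustrative only and are not taken from Commons Imaging; 8 bits per sample and 3 samples per pixel are assumed, matching the 3 bytes per pixel mentioned earlier. Note that the denominator is a bit count per row, so 64000 corresponds to 8000 bytes per strip, which is close to the roughly 8K-byte strip size that the TIFF 6.0 specification suggests. Whether that was the original intent is not established here.

```java
final class RowsPerStripDemo {
    /** Mirrors the quoted expression; Java integer division truncates toward zero. */
    static int rowsPerStrip(int width, int bitsPerSample, int samplesPerPixel) {
        return 64000 / (width * bitsPerSample * samplesPerPixel);
    }

    public static void main(String[] args) {
        // Assumed sample layout: 8 bits per sample, 3 samples per pixel (RGB).
        System.out.println(rowsPerStrip(2550, 8, 3)); // 64000 / 61200 truncates to 1
        System.out.println(rowsPerStrip(640, 8, 3));  // 64000 / 15360 truncates to 4
    }
}
```

With these assumptions, any image wider than about 1333 pixels gets at most one row per strip, because two rows would already exceed 64000 bits.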

Anyway, with your 2550-pixel-wide image, this calculation comes out to one row per strip.
Consequently, the LZW compressor compresses each row of pixels independently.  Of course,
the LZW compressor is usually much better at compressing longer texts than it is at compressing
short texts.  Since the image writer is taking only a single row of pixels at a time, the
"texts" are rather shorter than they could be. So the compressor does not achieve as good
a ratio as it might.
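
The effect of strip size on compression can be illustrated with a rough Java sketch. It is not the Commons Imaging writer and it does not use LZW: java.util.zip.Deflater stands in as a readily available dictionary-style compressor, and the image is synthetic (a horizontal gray gradient repeated on every row). All class and method names are hypothetical.

```java
import java.util.zip.Deflater;

final class StripCompressionDemo {
    /** Synthetic RGB image: each row is the same horizontal gray gradient. */
    static byte[] syntheticImage(int width, int height) {
        byte[] img = new byte[width * height * 3];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int i = (y * width + x) * 3;
                byte v = (byte) x;
                img[i] = v;
                img[i + 1] = v;
                img[i + 2] = v;
            }
        }
        return img;
    }

    /** Size in bytes of one independently compressed chunk (Deflate, not LZW). */
    static int deflatedSize(byte[] data, int offset, int length) {
        Deflater deflater = new Deflater();
        deflater.setInput(data, offset, length);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    /** Sum of compressed sizes when every strip is compressed on its own. */
    static int totalCompressedSize(byte[] image, int rowBytes, int height, int rowsPerStrip) {
        int total = 0;
        for (int y = 0; y < height; y += rowsPerStrip) {
            int rows = Math.min(rowsPerStrip, height - y); // last strip may be shorter
            total += deflatedSize(image, y * rowBytes, rows * rowBytes);
        }
        return total;
    }

    public static void main(String[] args) {
        int width = 256, height = 64, rowBytes = width * 3;
        byte[] image = syntheticImage(width, height);
        System.out.println("1 row per strip:   " + totalCompressedSize(image, rowBytes, height, 1));
        System.out.println("64 rows per strip: " + totalCompressedSize(image, rowBytes, height, 64));
    }
}
```

On this synthetic input the single 64-row strip compresses to far fewer bytes than 64 one-row strips, because the compressor can reuse what it learned from earlier rows. Real images and the real LZW coder will show different numbers.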

I hacked the code to use a larger value than 64000 so that the rowsPerStrip value was larger.
This change reduced the size of the file quite a bit.  Unfortunately, now I have a block of
black pixels added to the bottom of the image...  So there is something else about the way
the image writer works that I don't understand yet. 
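
As background for any change to rowsPerStrip, and not as a diagnosis of the black block: in TIFF 6.0 the RowsPerStrip value applies to every strip except possibly the last one, and the strip count is the height divided by rowsPerStrip, rounded up. The Java sketch below shows that bookkeeping. The names and the value of 64 rows per strip are hypothetical and are not taken from the Commons Imaging source.

```java
final class StripLayout {
    /** Number of strips needed to cover the image (ceiling division). */
    static int stripsPerImage(int height, int rowsPerStrip) {
        return (height + rowsPerStrip - 1) / rowsPerStrip;
    }

    /** Rows actually present in a given strip; only the last strip can be short. */
    static int rowsInStrip(int stripIndex, int height, int rowsPerStrip) {
        int firstRow = stripIndex * rowsPerStrip;
        return Math.min(rowsPerStrip, height - firstRow);
    }

    public static void main(String[] args) {
        int height = 3300;
        int rowsPerStrip = 64; // hypothetical larger value, for illustration only
        int strips = stripsPerImage(height, rowsPerStrip);
        System.out.println("strips: " + strips);
        System.out.println("rows in last strip: " + rowsInStrip(strips - 1, height, rowsPerStrip));
    }
}
```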

I'll let you know if I make more progress.


> TIFF and PNG images should not be bigger than the ones created by java ImageIO
> ------------------------------------------------------------------------------
>                 Key: IMAGING-126
>                 URL:
>             Project: Commons Imaging
>          Issue Type: Improvement
>          Components: Format: PNG, Format: TIFF
>    Affects Versions: 1.0
>         Environment: W7
>            Reporter: Tilman Hausherr
>            Priority: Minor
>         Attachments: pdfbox-1870-devicen3-01.png, pdfbox-1870-devicen3-01.tif, pdfbox-1870-devicen3.pdf-1.png,
> I tried to use Apache Imaging for the PDFBOX project (PDFBOX-1734) because of problems
> with setting the tiff resolution in java imageio.
> While the code is pretty nice, I found that the generated images are sometimes much bigger
> in size than the ones generated by java imageio.
> Example:
> pdfbox-1870-devicen3-01.png 50 KB (imageio)
> pdfbox-1870-devicen3.pdf-1.png 70 KB (imaging)
> pdfbox-1870-devicen3-01.tif 401 KB (imageio)
> pdfbox-1870-devicen3.pdf-1.tif 1063 KB (imaging)
