pdfbox-users mailing list archives

From Joel Hirsh <joelehi...@gmail.com>
Subject Re: Image generation changes from 2.0.6 to 2.0.7
Date Sun, 01 Oct 2017 22:00:00 GMT
My code looks like this

BufferedImage pageimage = new PDFRenderer(pdfdocument).renderImageWithDPI(pagenum, outputres);

ImageIO.write(pageimage, "png", tmpfile);

where outputres is 300.0f.

The only difference on my side is that in one case I build with
pdfbox-app-2.0.6.jar and in the other with pdfbox-app-2.0.7.jar.

In both versions I get a full-page image that is 2550 x 3300 pixels, with a
bit depth of 24 bits. However, for the one-page file I am looking at, the
2.0.6 .png file is 1.53MB and the 2.0.7 file is 7.83MB.
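
To put those numbers in context, here is a rough sketch of the arithmetic. It assumes the page is 8.5 x 11 inches (inferred from 2550 x 3300 at 300 DPI, not confirmed) and that 1 MB means 1024 * 1024 bytes. The class and method names are made up for illustration and are not part of PDFBox. Both files hold the same number of pixels, so the size gap comes entirely from how well each image compresses.

```java
// Sketch only: relates page size and DPI to pixel dimensions and raw image size.
// The 8.5 x 11 inch page is an assumption inferred from the 2550 x 3300 result.
final class RenderSizeSketch {

    // Pixels along one edge for a length in inches at a resolution in DPI.
    static int pixels(double inches, double dpi) {
        return (int) Math.round(inches * dpi);
    }

    // Uncompressed size in bytes of a 24-bit image (3 bytes per pixel).
    static long rawBytes(int width, int height) {
        return (long) width * height * 3L;
    }

    public static void main(String[] args) {
        int width = pixels(8.5, 300.0);
        int height = pixels(11.0, 300.0);
        long raw = rawBytes(width, height);
        System.out.println(width + " x " + height + " pixels, " + raw + " raw bytes");

        // File sizes as reported in this message; 1 MB = 1024 * 1024 bytes is an assumption.
        double mb = 1024.0 * 1024.0;
        System.out.printf("2.0.6 compression ratio: %.1f%n", raw / (1.53 * mb));
        System.out.printf("2.0.7 compression ratio: %.1f%n", raw / (7.83 * mb));
    }
}
```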

If I read both .png files into Photoshop and zoom way in, they are indeed
different.  This is a screenshot of them side by side with 2.0.6 on the
left and 2.0.7 on the right.

[image: Inline image 1]

Since Gmail seemed to shrink the image when I pasted it in, I have also
included it as an attachment.

If you look at the attachment closely you can see the Photoshop pixel
numbers on the left.  It appears that the 2.0.6 image has a minimum
resolution of 2 pixels, whereas the 2.0.7 image has a resolution of 1 pixel.
But the 2.0.7 image seems slightly out of focus.  So, as I said before, I'm
not sure which is better, but it's not a difference I would expect, and I
would like to have some control over it.
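
To quantify the difference rather than eyeballing it in Photoshop, something like the following sketch could be run on the two .png files. It uses only the JDK (ImageIO and BufferedImage). The class name and both helper methods are hypothetical, written just for this comparison. It counts how many pixels differ and how many distinct colours each image contains. A softer image usually contains more intermediate colour values, and PNG tends to compress those less well, so this may account for the larger 2.0.7 file, though that is a guess on my part.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;
import javax.imageio.ImageIO;

// Sketch only: compares two rendered pages pixel by pixel.
final class ImageDiffSketch {

    // Number of pixel positions whose ARGB values differ between the two images.
    static long differingPixels(BufferedImage a, BufferedImage b) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            throw new IllegalArgumentException("image dimensions differ");
        }
        long count = 0;
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                if (a.getRGB(x, y) != b.getRGB(x, y)) {
                    count++;
                }
            }
        }
        return count;
    }

    // Number of distinct ARGB values in the image.
    static int distinctColors(BufferedImage img) {
        Set<Integer> colors = new HashSet<>();
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                colors.add(img.getRGB(x, y));
            }
        }
        return colors.size();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: ImageDiffSketch <first.png> <second.png>");
            return;
        }
        BufferedImage first = ImageIO.read(new File(args[0]));
        BufferedImage second = ImageIO.read(new File(args[1]));
        if (first == null || second == null) {
            // ImageIO.read returns null when no registered reader understands the file.
            System.out.println("could not decode one of the files");
            return;
        }
        System.out.println("differing pixels: " + differingPixels(first, second));
        System.out.println("distinct colours in first:  " + distinctColors(first));
        System.out.println("distinct colours in second: " + distinctColors(second));
    }
}
```

The two arguments would be the .png written with the 2.0.6 build and the one written with the 2.0.7 build.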


Regards

On Sun, Oct 1, 2017 at 10:19 AM, Tilman Hausherr <THausherr@t-online.de>
wrote:

> Hi,
>
> No idea what you mean. I have a test with about 1000 PDF files that
> renders at 96dpi and compares the result. If the images were bigger /
> smaller then this would be noticed.
>
> Please explain what size you get from what PDF with what version and what
> code.
>
> Tilman
>
>
> Am 01.10.2017 um 18:55 schrieb Joel Hirsh:
>
>> I am using PDFRenderer.renderImageWithDPI and found that the generated
>> images have changed substantially from 2.0.6 to 2.0.7.  Images are 2x to
>> 4x bigger in 2.0.7 than in 2.0.6, which really impacts OCR processing on
>> those images.
>>
>> It's hard to say whether or how one is 'better' than the other, but I'd
>> like to understand what is happening and maybe how to control this.  I
>> didn't see anything in the 2.0.7 release notes that seemed to indicate a
>> difference.
>>
>> I have verified that I can go back and forth between 2.0.6 and 2.0.7 and
>> get consistent results within each version.  And everything else is the
>> same.
>>
>> I am also using the following jars:
>> levigo-jbig2-imageio-1.6.5.jar
>> imageio-jpeg-3.2.1.jar
>> imageio-metadata-3.2.1.jar
>> imageio-core-3.2.1.jar
>> common-image-3.2.1.jar
>> jai-imageio-core-1.3.1.jar
>> jai-imageio-jpeg2000-1.3.1_CODICE_1.jar
>>
>> Any light you can cast on this would be appreciated.
>>
>>
>
