pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: Image generation changes from 2.0.6 to 2.0.7
Date Mon, 02 Oct 2017 07:23:17 GMT
On 02.10.2017 at 00:50, Joel Hirsh wrote:
> The only difference I can see between PageDrawer in the two versions 
> has to do with interpolation.  I am thinking that 
> somehow interpolation defaults to off in 2.0.6 and on in 2.0.7.  Does 
> that make sense?

No, it was never off. It may have been off in some respects, as in 
PDFBOX-3615 <https://issues.apache.org/jira/browse/PDFBOX-3615>.

Tilman
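
To illustrate the "blocky" versus "fuzzy" trade-off discussed in this thread, here is a minimal sketch that uses only plain Java2D, not PDFBox. It upscales a small black-and-white test pattern twice, once with nearest-neighbor and once with bicubic interpolation, and counts the distinct colors in each result. The class and method names are hypothetical, and the choice of bicubic is an assumption made for the demonstration; the thread does not state which hint values PageDrawer sets.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.util.HashSet;
import java.util.Set;

public class InterpolationDemo {

    // Builds a small black-and-white pattern standing in for a
    // low-resolution image embedded in a page (hypothetical test data).
    static BufferedImage checkerboard(int size) {
        BufferedImage img = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                img.setRGB(x, y, ((x + y) % 2 == 0) ? 0xFFFFFF : 0x000000);
            }
        }
        return img;
    }

    // Scales src by an integer factor using the given KEY_INTERPOLATION value.
    static BufferedImage scale(BufferedImage src, int factor, Object interpolationHint) {
        BufferedImage dst = new BufferedImage(
                src.getWidth() * factor, src.getHeight() * factor, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION, interpolationHint);
        g.drawImage(src, 0, 0, dst.getWidth(), dst.getHeight(), null);
        g.dispose();
        return dst;
    }

    // Counts the distinct RGB values in an image.
    static int distinctColors(BufferedImage img) {
        Set<Integer> colors = new HashSet<>();
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                colors.add(img.getRGB(x, y) & 0xFFFFFF);
            }
        }
        return colors.size();
    }

    public static void main(String[] args) {
        BufferedImage src = checkerboard(8);
        BufferedImage blocky = scale(src, 2, RenderingHints.VALUE_INTERPOLATION_NEAREST_NEIGHBOR);
        BufferedImage smooth = scale(src, 2, RenderingHints.VALUE_INTERPOLATION_BICUBIC);
        System.out.println("nearest neighbor colors: " + distinctColors(blocky));
        System.out.println("bicubic colors:          " + distinctColors(smooth));
    }
}
```

The nearest-neighbor result keeps only the two original colors, with each source pixel repeated as a 2 x 2 block. The interpolated result contains intermediate grays, which look softer when zoomed in. An image with many intermediate values will often compress less well as PNG, which may be one reason the interpolated output is larger; that is a plausible explanation rather than something confirmed in this thread.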

>
> And I'm not sure that interpolating makes an image better if the 
> original data doesn't have the resolution to begin with.  That could 
> account for the fuzziness.
>
> Thanks for the insight.
>
> On Sun, Oct 1, 2017 at 3:22 PM, Tilman Hausherr <THausherr@t-online.de> wrote:
>
>     Ah, now I get it. Initially I thought you meant size as in height
>     / width.
>
>     There have sometimes been some improvements re image quality; more
>     quality = bigger images. The one on the right is the better one.
>     Nobody would like an image with "blocky" glyphs.
>
>     I remember one issue, but that one was fixed in 2.0.4:
>     https://issues.apache.org/jira/browse/PDFBOX-3615
>
>     Another one is
>     https://issues.apache.org/jira/browse/PDFBOX-1958
>     but that one was fixed in 2.0.5 (and sometimes made images worse).
>
>     To find out what changed, I'd need to have a specific file to test
>     with. I could then get different versions from the repository and
>     build and test them to find when this happened. (But it would be
>     better if you'd do it.)
>
>     Currently we don't offer a way to switch this on or off. You could
>     change it in the source code: in PageDrawer.java, search for
>     setRenderingHints().
>
>     Tilman
>
>
>
>     On 02.10.2017 at 00:00, Joel Hirsh wrote:
>>     My code looks like this
>>
>>     BufferedImage pageimage =
>>         new PDFRenderer(pdfdocument).renderImageWithDPI(pagenum, outputres);
>>
>>     ImageIO.write(pageimage, "png", tmpfile);
>>
>>     where outputres is 300.0f.
>>
>>     The only difference on my side is that in one case I build with
>>     pdfbox-app-2.0.6.jar, the other with pdfbox-app-2.0.7.jar
>>
>>     In both versions I get a full page image that is 2550 x 3300
>>     pixels, with a bit depth of 24 bits. However, for the one-page
>>     file I am looking at, the 2.0.6 .png file is 1.53 MB and the
>>     2.0.7 file is 7.83 MB.
>>
>>     If I read both .png files into Photoshop and zoom way in, they
>>     are indeed different.  This is a screenshot of them side by side
>>     with 2.0.6 on the left and 2.0.7 on the right.
>>
>>     [Inline image 1]
>>
>>     Since Gmail seemed to shrink the image when I pasted it in, I
>>     also included it as an attachment.
>>
>>     If you look at the attachment closely you can see the Photoshop
>>     pixel numbers on the left. It appears that 2.0.6 actually has a
>>     minimum resolution of 2 pixels, whereas 2.0.7 has a resolution of
>>     1 pixel.  But 2.0.7 seems slightly out of focus.  So as I said
>>     before, I'm not sure which is better, but it's not a difference I
>>     would expect, and I would like to have some control over it.
>>
>>
>>     Regards
>>
>>
>>     On Sun, Oct 1, 2017 at 10:19 AM, Tilman Hausherr
>>     <THausherr@t-online.de> wrote:
>>
>>         Hi,
>>
>>         No idea what you mean. I have a test with about 1000 PDF
>>         files that renders at 96dpi and compares the result. If the
>>         images were bigger / smaller then this would be noticed.
>>
>>         Please explain what size you get from what PDF with what
>>         version and what code.
>>
>>         Tilman
>>
>>
>>         On 01.10.2017 at 18:55, Joel Hirsh wrote:
>>
>>             I am using PDFRenderer.renderImageWithDPI and found that the
>>             generated images have changed substantially from 2.0.6 to
>>             2.0.7. Images are 2x to 4x bigger in 2.0.7 than in 2.0.6,
>>             which really impacts OCR processing on those images.
>>
>>             It's hard to say whether or how one is 'better' than the
>>             other, but I'd like to understand what is happening and
>>             maybe how to control this.  I didn't see anything in the
>>             2.0.7 release notes that seemed to indicate a difference.
>>
>>             I have verified that I can go back and forth between 2.0.6
>>             and 2.0.7 and get consistent results within each version.
>>             And everything else is the same.
>>
>>             I am also using the following jars:
>>             levigo-jbig2-imageio-1.6.5.jar
>>             imageio-jpeg-3.2.1.jar
>>             imageio-metadata-3.2.1.jar
>>             imageio-core-3.2.1.jar
>>             common-image-3.2.1.jar
>>             jai-imageio-core-1.3.1.jar
>>             jai-imageio-jpeg2000-1.3.1_CODICE_1.jar
>>
>>             Any light you can cast on this would be appreciated.
>>
>>
>>
>>         ---------------------------------------------------------------------
>>         To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
>>         <mailto:users-unsubscribe@pdfbox.apache.org>
>>         For additional commands, e-mail: users-help@pdfbox.apache.org
>>         <mailto:users-help@pdfbox.apache.org>
>>
>>
>>
>>
>
>
>
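
The figures Joel reports (2550 x 3300 pixels at 300 dpi, 24 bits per pixel) can be checked with a little arithmetic. The sketch below assumes a US Letter page of 612 x 792 points, which the thread does not state but which matches the reported pixel size, and it assumes the pixel dimensions are simply points times dpi divided by 72, rounded down. The class and method names are hypothetical, and PDFBox's exact rounding is not shown in this thread.

```java
public class RenderSizeCheck {

    // PDF user space uses 72 units (points) per inch.
    static final float POINTS_PER_INCH = 72f;

    // Pixel extent for a page dimension given in points at the requested dpi.
    static int pixels(float points, float dpi) {
        return (int) Math.floor(points * dpi / POINTS_PER_INCH);
    }

    // Size in bytes of the uncompressed pixel data.
    static long rawBytes(int width, int height, int bitsPerPixel) {
        return (long) width * height * bitsPerPixel / 8;
    }

    public static void main(String[] args) {
        int width = pixels(612f, 300f);   // assumed Letter width in points
        int height = pixels(792f, 300f);  // assumed Letter height in points
        System.out.println(width + " x " + height + " pixels");
        System.out.println(rawBytes(width, height, 24) + " bytes uncompressed");
    }
}
```

Under these assumptions the page is 2550 x 3300 pixels in both versions, and the uncompressed data is 25,245,000 bytes either way. The 1.53 MB and 7.83 MB files are therefore both compressed PNGs of the same dimensions, so the size difference comes from how compressible the pixel content is, not from a change in height or width.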

