pdfbox-users mailing list archives

From Divya George <divya.god...@gmail.com>
Subject Problem with creating image for pdf created from Word, CutePDF
Date Thu, 15 May 2014 14:12:32 GMT
Hi,

Our application has a web service that needs to convert the first page of a
PDF document to an image. I'm using the 2.0.0 snapshot version of PDFBox to
accomplish this. This works for some PDF documents, but when I create a PDF
document from Microsoft Word 2013 or CutePDF, it fails to generate the
image.

Instead of the rendered page, the final image is displayed as this symbol: ÿØ

This is the information displayed in the logs:

2014-05-14 17:29:45,662 DEBUG [http-bio-8080-exec-5] (TTFGlyph2D.java:227)
- ABCDEE+Calibri: Glyph not found:3
2014-05-14 17:29:45,663 DEBUG [http-bio-8080-exec-5]
(PDFStreamEngine.java:246) - processing substream token: PDFOperator{ET}
2014-05-14 17:29:45,663 DEBUG [http-bio-8080-exec-5]
(PDFStreamEngine.java:246) - processing substream token: PDFOperator{BT}
2014-05-14 17:29:45,663 DEBUG [http-bio-8080-exec-5]
(PDFStreamEngine.java:246) - processing substream token: COSInt{1}
2014-05-14 17:29:45,663 DEBUG [http-bio-8080-exec-5]
(PDFStreamEngine.java:246) - processing substream token: COSInt{0}
2014-05-14 17:29:45,663 DEBUG [http-bio-8080-exec-5]
(PDFStreamEngine.java:246) - processing substream token: COSInt{0}
2014-05-14 17:29:45,663 DEBUG [http-bio-8080-exec-5]
(PDFStreamEngine.java:246) - processing substream token: COSInt{1}
2014-05-14 17:29:45,663 DEBUG [http-bio-8080-exec-5]
(PDFStreamEngine.java:246) - processing substream token: COSFloat{72.024}
2014-05-14 17:29:45,664 DEBUG [http-bio-8080-exec-5]
(PDFStreamEngine.java:246) - processing substream token: COSFloat{684.1}
2014-05-14 17:29:45,664 DEBUG [http-bio-8080-exec-5]
(PDFStreamEngine.java:246) - processing substream token: PDFOperator{Tm}
2014-05-14 17:29:45,664 DEBUG [http-bio-8080-exec-5]
(PDFStreamEngine.java:246) - processing substream token:
COSArray{[COSString{ }]}
2014-05-14 17:29:45,664 DEBUG [http-bio-8080-exec-5]
(PDFStreamEngine.java:246) - processing substream token: PDFOperator{TJ}
2014-05-14 17:29:45,664 DEBUG [http-bio-8080-exec-5] (Encoding.java:242) -
No character for name space
2014-05-14 17:29:45,665 DEBUG [http-bio-8080-exec-5] (TTFGlyph2D.java:227)
- ABCDEE+Calibri: Glyph not found:3

I tried changing the fonts in Word and also tried using CutePDF to generate
the PDF document, but I still see the wrong output. Our application receives
PDFs from different sources, and we have no control over how they are
generated.

Here is the snippet of the code I use.

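        // Load the PDF and render page index 0 (the first page) at 96 DPI.
        PDDocument pdf = PDDocument.load(orginalFileName);
        PDFRenderer renderer = new PDFRenderer(pdf);
        BufferedImage image = renderer.renderImageWithDPI(0, 96);

        // Encode the rendered page as JPEG into an in-memory buffer.
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ImageIO.write( image, "jpg", baos );
        baos.flush();
        baos.close();
        pdf.close();
        // The web service returns the buffer holding the JPEG bytes.
        return baos;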
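
One check that may help narrow this down: a JPEG stream always begins with the two bytes 0xFF 0xD8, and those bytes read as ISO-8859-1 text are the characters ÿ and Ø. So the symbol above might mean that valid JPEG bytes are being produced and then shown as text somewhere downstream, although that is only a guess. The sketch below uses only the standard library. The class name JpegCheck and its helper methods are hypothetical, and a blank BufferedImage stands in for the page that PDFRenderer would return.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of PDFBox.
class JpegCheck {

    // Encode an image as JPEG and return the raw bytes.
    public static byte[] toJpegBytes(BufferedImage image) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // ImageIO.write returns false when no writer accepts the image type.
        if (!ImageIO.write(image, "jpg", baos)) {
            throw new IOException("No JPEG writer accepted this image type");
        }
        return baos.toByteArray();
    }

    // A JPEG stream starts with the start-of-image marker 0xFF 0xD8.
    public static boolean looksLikeJpeg(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xFF) == 0xFF
                && (data[1] & 0xFF) == 0xD8;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a blank RGB image stands in for the rendered PDF page.
        BufferedImage image = new BufferedImage(10, 10, BufferedImage.TYPE_INT_RGB);
        byte[] jpeg = toJpegBytes(image);
        System.out.println("Starts with JPEG marker: " + looksLikeJpeg(jpeg));
        // Decode the first two bytes as ISO-8859-1 to see them as characters.
        System.out.println(new String(jpeg, 0, 2, StandardCharsets.ISO_8859_1));
    }
}
```

If looksLikeJpeg returns true for the bytes held in baos, the rendering and encoding steps produced a JPEG, and the place to look would be how the web service returns those bytes (for example the response content type, or a conversion of the bytes to a String).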

Please let me know if there is something I'm missing or if I should be
using a different method to create the image. The PDF that I use is
attached.

Thanks in advance,
Divya
