Date: Thu, 15 May 2014 10:12:32 -0400
From: Divya George
To: users@pdfbox.apache.org
Reply-To: users@pdfbox.apache.org
Subject: Problem with creating image for pdf created from Word, CutePDF

Hi,

Our application has a web service that needs to convert the first page of a PDF document to an image. I'm using the 2.0.0 snapshot version of PDFBox to accomplish this. This works for some PDF documents, but when I create a PDF document from Microsoft Word 2013 or CutePDF, it fails to generate the image.

The final image displayed is this symbol: ÿØ

and this is the information displayed in the logs.

2014-05-14 17:29:45,662 DEBUG [http-bio-8080-exec-5] (TTFGlyph2D.java:227) - ABCDEE+Calibri: Glyph not found:3
2014-05-14 17:29:45,663 DEBUG [http-bio-8080-exec-5] (PDFStreamEngine.java:246) - processing substream token: PDFOperator{ET}
2014-05-14 17:29:45,663 DEBUG [http-bio-8080-exec-5] (PDFStreamEngine.java:246) - processing substream token: PDFOperator{BT}
2014-05-14 17:29:45,663 DEBUG [http-bio-8080-exec-5] (PDFStreamEngine.java:246) - processing substream token: COSInt{1}
2014-05-14 17:29:45,663 DEBUG [http-bio-8080-exec-5] (PDFStreamEngine.java:246) - processing substream token: COSInt{0}
2014-05-14 17:29:45,663 DEBUG [http-bio-8080-exec-5] (PDFStreamEngine.java:246) - processing substream token: COSInt{0}
2014-05-14 17:29:45,663 DEBUG [http-bio-8080-exec-5] (PDFStreamEngine.java:246) - processing substream token: COSInt{1}
2014-05-14 17:29:45,663 DEBUG [http-bio-8080-exec-5] (PDFStreamEngine.java:246) - processing substream token: COSFloat{72.024}
2014-05-14 17:29:45,664 DEBUG [http-bio-8080-exec-5] (PDFStreamEngine.java:246) - processing substream token: COSFloat{684.1}
2014-05-14 17:29:45,664 DEBUG [http-bio-8080-exec-5] (PDFStreamEngine.java:246) - processing substream token: PDFOperator{Tm}
2014-05-14 17:29:45,664 DEBUG [http-bio-8080-exec-5] (PDFStreamEngine.java:246) - processing substream token: COSArray{[COSString{ }]}
2014-05-14 17:29:45,664 DEBUG [http-bio-8080-exec-5] (PDFStreamEngine.java:246) - processing substream token: PDFOperator{TJ}
2014-05-14 17:29:45,664 DEBUG [http-bio-8080-exec-5] (Encoding.java:242) - No character for name space
2014-05-14 17:29:45,665 DEBUG [http-bio-8080-exec-5] (TTFGlyph2D.java:227) - ABCDEE+Calibri: Glyph not found:3

I tried changing the fonts in Word and also tried using CutePDF to generate the PDF document, but I still see the wrong output. Our application receives PDFs from different sources, and we have no control over how they are generated.

Here is the snippet of the code I use.

    PDDocument pdf = PDDocument.load(orginalFileName);
    PDFRenderer renderer = new PDFRenderer(pdf);
    BufferedImage image = renderer.renderImageWithDPI(0, 96);

    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ImageIO.write( image, "jpg", baos );
    baos.flush();
    baos.close();
    pdf.close();
    return baos;
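
For reference, here is a minimal sketch of the encoding step in isolation. It is not a confirmed diagnosis of the problem. It uses only the JDK, so a blank `BufferedImage` stands in for the page that `renderer.renderImageWithDPI(0, 96)` would return. The class `PageImageEncoder` and its methods are hypothetical names, not part of PDFBox. The sketch checks the boolean that `ImageIO.write` returns and tests whether the output starts with the JPEG start-of-image marker, the bytes 0xFF 0xD8. Read as ISO-8859-1 text, those two bytes are the characters ÿØ, so it may be worth checking whether the image bytes are being displayed as text somewhere downstream.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of PDFBox: encodes an already rendered page image.
final class PageImageEncoder {

    // Encodes the image in the given ImageIO format ("jpg", "png", ...) and returns the raw bytes.
    static byte[] encode(BufferedImage image, String format) throws IOException {
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            // ImageIO.write returns false, without throwing, when no writer accepts the image.
            if (!ImageIO.write(image, format, out)) {
                throw new IOException("No ImageIO writer for format: " + format);
            }
            return out.toByteArray();
        }
    }

    // True when the data starts with the JPEG start-of-image marker 0xFF 0xD8.
    static boolean startsWithJpegMarker(byte[] data) {
        return data.length >= 2 && (data[0] & 0xFF) == 0xFF && (data[1] & 0xFF) == 0xD8;
    }

    public static void main(String[] args) throws IOException {
        // Keep ImageIO's intermediate buffering in memory instead of a temp file.
        ImageIO.setUseCache(false);
        // Assumption: a blank RGB image stands in for the rendered PDF page.
        BufferedImage page = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        byte[] jpeg = encode(page, "jpg");
        System.out.println("bytes=" + jpeg.length + ", jpegMarker=" + startsWithJpegMarker(jpeg));
    }
}
```

With PDFBox on the classpath, the call would presumably look like `PageImageEncoder.encode(renderer.renderImageWithDPI(0, 96), "jpg")`, with the resulting byte array written to the HTTP response as binary data rather than as a string.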

Please let me know if there is something I'm missing or if I should be using a different method to create the image. The PDF that I use is attached.

Thanks in advance,
Divya
