pdfbox-users mailing list archives

From Andreas Lehmkühler <andr...@lehmi.de>
Subject Re: PDFBox 2.0.0 and UTF8 chars
Date Mon, 02 Mar 2015 12:04:28 GMT
Hi

> Tilman Hausherr <THausherr@t-online.de> wrote on 1 March 2015 at 19:54:
> 
> 
> Heh heh, I wanted to make a similar comment, but then I saw the stack 
> trace showing that he did just that...
Oops, you are right. The stack trace doesn't belong to the listed code. So, most
likely there is an issue with that specific font: either a malformed font or a
FontBox issue.
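
As for the second exception in the original post (U+010D is not available in
this font's Encoding): a simple TrueType font addresses glyphs through a
single-byte encoding, so it can only reach the characters that encoding lists,
even if the font file itself contains more glyphs. Here is a minimal sketch of
that limitation in plain Java, without PDFBox. It assumes the encoding behaves
roughly like windows-1252; the class and method names (SingleByteCheck, fits)
are made up for illustration and are not PDFBox API.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

class SingleByteCheck {

    // Returns true if every character of 'text' has a code in the named charset.
    // Assumption: windows-1252 stands in for the font's single-byte encoding.
    static boolean fits(String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        // Plain ASCII text has codes in the single-byte charset.
        System.out.println("Hello World fits: " + fits("Hello World", "windows-1252"));
        // U+010D (c with caron) has no code in windows-1252.
        System.out.println("U+010D fits: " + fits("\u010D", "windows-1252"));
    }
}
```

A Type0 (composite) font uses multi-byte codes and is not limited in this way,
which is why loading the TTF via PDType0Font was suggested earlier in the
thread.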

BR
Andreas Lehmkühler
> 
> Tilman
> 
> On 01.03.2015 at 18:53, Andreas Lehmkuehler wrote:
> > Hi,
> >
> > On 28.02.2015 at 11:52, Ivan Klaric wrote:
> >> Hello good PDFBox people,
> >>
> >> I am working on a pet project with PDFBox and I encountered what seems to
> >> be an issue with UTF8 chars. If you take the following standard example:
> >>
> >>      public static void main(String[] args) {
> >>          try {
> >>              PDDocument document = new PDDocument();
> >>              PDPage page = new PDPage();
> >>              document.addPage( page );
> >>              PDFont font = PDTrueTypeFont.loadTTF(document, new File("res/Roboto-Regular.ttf"));
> >
> > Try to load the TTF font as a Type0 font
> >
> > PDFont font = PDType0Font.load(document, new File("res/Roboto-Regular.ttf"));
> >
> > BR
> > Andreas Lehmkühler
> >
> >>              PDPageContentStream contentStream = null;
> >>              contentStream = new PDPageContentStream(document, page);
> >>              contentStream.beginText();
> >>              contentStream.setFont( font, 12 );
> >>              contentStream.moveTextPositionByAmount( 100, 700 );
> >>              contentStream.drawString( "Hello World čćžšđČĆŽŠĐ" );
> >>              contentStream.endText();
> >>              contentStream.close();
> >>              document.save( "/tmp/HelloWorld.pdf");
> >>              document.close();
> >>
> >>          } catch (IOException e) {
> >>              e.printStackTrace();
> >>          }
> >>      }
> >>
> >> (those weird characters in the drawString method are some pretty common
> >> Croatian letters). This is what I get:
> >> java.io.IOException: Error: Could not find referenced cmap stream Identity-H
> >> at org.apache.fontbox.cmap.CMapParser.getExternalCMap(CMapParser.java:418)
> >> at org.apache.fontbox.cmap.CMapParser.parsePredefined(CMapParser.java:84)
> >> at org.apache.pdfbox.pdmodel.font.CMapManager.getPredefinedCMap(CMapManager.java:54)
> >> at org.apache.pdfbox.pdmodel.font.PDType0Font.readEncoding(PDType0Font.java:159)
> >> at org.apache.pdfbox.pdmodel.font.PDType0Font.<init>(PDType0Font.java:119)
> >> at org.apache.pdfbox.pdmodel.font.PDType0Font.load(PDType0Font.java:59)
> >> at com.company.Main.main(Main.java:20)
> >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >> at java.lang.reflect.Method.invoke(Method.java:483)
> >> at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
> >>
> >>
> >> Am I doing something wrong? I took the Roboto-Regular font here:
> >> http://www.fontsquirrel.com/fonts/roboto
> >>
> >> If I remove the weird Croatian characters, the error remains the same.
> >> However, if I use the PDTrueTypeFont.loadTTF() (which seems to be
> >> deprecated) the same thing works without the Croatian characters. If I put
> >> the Croatian characters back in (and use PDTrueTypeFont), I get
> >>
> >> Exception in thread "main" java.lang.IllegalArgumentException: U+010D is not available in this font's Encoding
> >> at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.encode(PDTrueTypeFont.java:261)
> >> at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:268)
> >> at org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:316)
> >> at org.apache.pdfbox.pdmodel.PDPageContentStream.drawString(PDPageContentStream.java:282)
> >> at com.company.Main.main(Main.java:25)
> >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >> at java.lang.reflect.Method.invoke(Method.java:483)
> >> at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
> >>
> >> I manually looked into the font file and it seems to contain the U+010D
> >> character. What am I doing wrong here?
> >>
> >> Thanks,
> >> Ivan
> >>
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> > For additional commands, e-mail: users-help@pdfbox.apache.org
> >
> 
> 
>
