pdfbox-users mailing list archives

From John Hewson <j...@jahewson.com>
Subject Re: PDFBox 2.0.0 and UTF8 chars
Date Thu, 05 Mar 2015 19:44:53 GMT
Our Ant build is obsolete; use Maven instead:

mvn clean install

I don’t know what the status of the Ant build is; perhaps it will be removed. We don’t
really need two build systems.
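
If you want to confirm that a jar you built actually contains the cmap resources, a small check along these lines might help. This is only a sketch using the JDK: JarResourceCheck is a made-up helper, not part of PDFBox, and the default entry name org/apache/fontbox/cmap/Identity-H is my assumption about where the resource lives inside the fontbox jar, so pass a different entry name if your layout differs.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarFile;

/** Hypothetical helper (not part of PDFBox): checks whether a jar contains an entry. */
final class JarResourceCheck {

    /** Returns true if the jar at jarPath has an entry with exactly this name. */
    static boolean containsEntry(Path jarPath, String entryName) throws IOException {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            return jar.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java JarResourceCheck <path-to-jar> [entry-name]");
            return;
        }
        // Assumed location of the Identity-H cmap inside the fontbox jar.
        String entry = args.length > 1 ? args[1] : "org/apache/fontbox/cmap/Identity-H";
        boolean found = containsEntry(Paths.get(args[0]), entry);
        System.out.println(entry + (found ? " is present" : " is MISSING"));
    }
}
```

Run it against the fontbox jar you copied into your lib folder. If the entry is reported missing, the build did not package the resources, which would match the "Could not find referenced cmap stream" error.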

— John

> On 4 Mar 2015, at 11:23, Ivan Klaric <iklaric@gmail.com> wrote:
> 
> OK, I'll focus on the PDType0Font.load version then. I build my pdfbox like
> this:
> 
> svn update && ant clean && ant build
> 
> and then copy fontbox-2.0.0.jar & pdfbox-2.0.0.jar from the target folder
> to my project's lib folder. This stack trace:
> java.io.IOException: Error: Could not find referenced cmap stream Identity-H
> at org.apache.fontbox.cmap.CMapParser.getExternalCMap(CMapParser.java:418)
> at org.apache.fontbox.cmap.CMapParser.parsePredefined(CMapParser.java:84)
> at org.apache.pdfbox.pdmodel.font.CMapManager.getPredefinedCMap(CMapManager.java:54)
> at org.apache.pdfbox.pdmodel.font.PDType0Font.readEncoding(PDType0Font.java:159)
> at org.apache.pdfbox.pdmodel.font.PDType0Font.<init>(PDType0Font.java:119)
> at org.apache.pdfbox.pdmodel.font.PDType0Font.load(PDType0Font.java:59)
> at com.company.Main.main(Main.java:20)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
> 
> is what I get when using PDType0Font.load() and the jars I get out of the
> Ant build. When I use pdfbox.jar from
> https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox/2.0.0-SNAPSHOT/
> 
> I get the same error. Note that there is no fontbox.jar in that folder, so
> I get an exception that points to the fact that fontbox.jar is missing:
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/fontbox/ttf/TTFParser
> at org.apache.pdfbox.pdmodel.font.TrueTypeEmbedder.buildFontFile2(TrueTypeEmbedder.java:90)
> at org.apache.pdfbox.pdmodel.font.TrueTypeEmbedder.<init>(TrueTypeEmbedder.java:72)
> at org.apache.pdfbox.pdmodel.font.PDTrueTypeFontEmbedder.<init>(PDTrueTypeFontEmbedder.java:56)
> at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.<init>(PDTrueTypeFont.java:184)
> at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.loadTTF(PDTrueTypeFont.java:81)
> at com.company.Main.main(Main.java:20)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
> Caused by: java.lang.ClassNotFoundException: org.apache.fontbox.ttf.TTFParser
> at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> ... 11 more
> 
> 
> When I use the fontbox.jar I got from the Ant build above, the error is
> exactly the same as when using the jars I built with Ant. Also, note that
> it seems to work fine when I remove the Croatian characters from the string
> and add e.g. German ones (stuff like ü, ä, ß...).
> 
> Thanks,
> Ivan
> 
> On Wed, Mar 4, 2015 at 7:07 PM John Hewson <john@jahewson.com> wrote:
> 
>> Hi,
>> 
>>> On 28 Feb 2015, at 02:52, Ivan Klaric <iklaric@gmail.com> wrote:
>>> 
>>> Hello good PDFBox people,
>>> 
>>> I am working on a pet project with PDFBox and I encountered what seems to
>>> be an issue with UTF8 chars. If you take the following standard example:
>>> 
>>>   public static void main(String[] args) {
>>>       try {
>>>           PDDocument document = new PDDocument();
>>>           PDPage page = new PDPage();
>>>           document.addPage( page );
>>>           PDFont font = PDTrueTypeFont.loadTTF(document, new File("res/Roboto-Regular.ttf"));
>>>           PDPageContentStream contentStream = null;
>>>           contentStream = new PDPageContentStream(document, page);
>>>           contentStream.beginText();
>>>           contentStream.setFont( font, 12 );
>>>           contentStream.moveTextPositionByAmount( 100, 700 );
>>>           contentStream.drawString( "Hello World čćžšđČĆŽŠĐ" );
>>>           contentStream.endText();
>>>           contentStream.close();
>>>           document.save( "/tmp/HelloWorld.pdf");
>>>           document.close();
>>> 
>>>       } catch (IOException e) {
>>>           e.printStackTrace();
>>>       }
>>>   }
>>> 
>>> (Those weird characters in the drawString method are some pretty common
>>> Croatian letters.) This is what I get:
>>> java.io.IOException: Error: Could not find referenced cmap stream Identity-H
>>> at org.apache.fontbox.cmap.CMapParser.getExternalCMap(CMapParser.java:418)
>>> at org.apache.fontbox.cmap.CMapParser.parsePredefined(CMapParser.java:84)
>>> at org.apache.pdfbox.pdmodel.font.CMapManager.getPredefinedCMap(CMapManager.java:54)
>>> at org.apache.pdfbox.pdmodel.font.PDType0Font.readEncoding(PDType0Font.java:159)
>>> at org.apache.pdfbox.pdmodel.font.PDType0Font.<init>(PDType0Font.java:119)
>>> at org.apache.pdfbox.pdmodel.font.PDType0Font.load(PDType0Font.java:59)
>>> at com.company.Main.main(Main.java:20)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:483)
>>> at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
>> 
>> There’s something wrong with how you built PDFBox, as this error means it
>> can’t find resources which we ship in the jar file. Try doing a "mvn clean
>> install" or using a snapshot jar instead. (Did you build using an IDE or
>> Ant perhaps?)
>> 
>>> Am I doing something wrong? I took the Roboto-Regular font here:
>>> http://www.fontsquirrel.com/fonts/roboto
>>> 
>>> If I remove the weird Croatian characters, the error remains the same.
>>> However, if I use the PDTrueTypeFont.loadTTF() (which seems to be
>>> deprecated) the same thing works without the Croatian characters. If I put
>>> the Croatian characters back in (and use PDTrueTypeFont), I get
>>> 
>>> Exception in thread "main" java.lang.IllegalArgumentException: U+010D is not available in this font's Encoding
>>> at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.encode(PDTrueTypeFont.java:261)
>>> at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:268)
>>> at org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:316)
>>> at org.apache.pdfbox.pdmodel.PDPageContentStream.drawString(PDPageContentStream.java:282)
>>> at com.company.Main.main(Main.java:25)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:483)
>>> at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
>> 
>> PDTrueTypeFont.loadTTF is deprecated and only supports ANSI. For full
>> Unicode support, use PDType0Font.load, as explained in the @deprecated
>> JavaDoc tag.
>> 
>>> I manually looked into the font file and it seems to contain the U+010D
>>> character. What am I doing wrong here?
>>> 
>>> Thanks,
>>> Ivan
>> 
>> — John
>> 
>> 
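
The ANSI limitation mentioned in the quoted reply would explain why ü, ä and ß work with PDTrueTypeFont while U+010D does not. The sketch below illustrates the idea with the JDK only. It assumes the Java charset windows-1252 is a close enough stand-in for the PDF WinAnsiEncoding (the two are similar but not guaranteed identical), and AnsiCheck is a hypothetical helper, not a PDFBox class.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of PDFBox): finds characters outside windows-1252. */
final class AnsiCheck {

    /** Returns the code points of text that windows-1252 cannot encode, as U+XXXX strings. */
    static List<String> unencodable(String text) {
        // Assumption: windows-1252 approximates the PDF WinAnsiEncoding.
        CharsetEncoder encoder = Charset.forName("windows-1252").newEncoder();
        List<String> missing = new ArrayList<>();
        text.codePoints().forEach(cp -> {
            String ch = new String(Character.toChars(cp));
            if (!encoder.canEncode(ch)) {
                missing.add(String.format("U+%04X", cp));
            }
        });
        return missing;
    }

    public static void main(String[] args) {
        // The Croatian letters from the example, written as escapes so that the
        // source file encoding does not matter.
        String croatian = "Hello World \u010D\u0107\u017E\u0161\u0111\u010C\u0106\u017D\u0160\u0110";
        // The German letters u-umlaut, a-umlaut and sharp s.
        String german = "\u00FC\u00E4\u00DF";
        System.out.println("Croatian sample, not encodable: " + unencodable(croatian));
        System.out.println("German sample, not encodable:   " + unencodable(german));
    }
}
```

Any character this reports is one that a single-byte ANSI font encoding is unlikely to cover, which is the case PDType0Font.load is meant for.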

