pdfbox-users mailing list archives

From Ivan Klaric <ikla...@gmail.com>
Subject Re: PDFBox 2.0.0 and UTF8 chars
Date Tue, 10 Mar 2015 21:54:54 GMT
OK, I can confirm it works like a charm with Maven. I hadn't used Maven
before; I saw build.xml and just assumed that the build should work
with Ant.

Thanks again!

On Sat, Mar 7, 2015 at 9:14 AM Ivan Klaric <iklaric@gmail.com> wrote:

> Thank you all, will look into it and report back as soon as I do.
>
> On Thu, Mar 5, 2015 at 8:45 PM John Hewson <john@jahewson.com> wrote:
>
>> Our Ant build is obsolete; use Maven instead:
>>
>> mvn clean install
>>
>> I don’t know what the status of the ant build is, if perhaps it will be
>> removed? We don’t really need two build systems...
>>
>> — John
>>
>> > On 4 Mar 2015, at 11:23, Ivan Klaric <iklaric@gmail.com> wrote:
>> >
>> > OK, I'll focus on the PDType0Font.load version then. I build my pdfbox
>> > like this:
>> >
>> > svn update && ant clean && ant build
>> >
>> > and then copy fontbox-2.0.0.jar & pdfbox-2.0.0.jar from the target
>> > folder to my project's lib folder. This stack trace:
>> > java.io.IOException: Error: Could not find referenced cmap stream Identity-H
>> > at org.apache.fontbox.cmap.CMapParser.getExternalCMap(CMapParser.java:418)
>> > at org.apache.fontbox.cmap.CMapParser.parsePredefined(CMapParser.java:84)
>> > at org.apache.pdfbox.pdmodel.font.CMapManager.getPredefinedCMap(CMapManager.java:54)
>> > at org.apache.pdfbox.pdmodel.font.PDType0Font.readEncoding(PDType0Font.java:159)
>> > at org.apache.pdfbox.pdmodel.font.PDType0Font.<init>(PDType0Font.java:119)
>> > at org.apache.pdfbox.pdmodel.font.PDType0Font.load(PDType0Font.java:59)
>> > at com.company.Main.main(Main.java:20)
>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > at java.lang.reflect.Method.invoke(Method.java:483)
>> > at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
>> >
>> > is what I get when using PDType0Font.load() and the jars I get out of
>> > the Ant build. When I use pdfbox.jar from
>> > https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox/2.0.0-SNAPSHOT/
>> >
>> > I get the same error. Note that there is no fontbox.jar in that folder,
>> > so I get an exception that points to the fact that fontbox.jar is missing:
>> > Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/fontbox/ttf/TTFParser
>> > at org.apache.pdfbox.pdmodel.font.TrueTypeEmbedder.buildFontFile2(TrueTypeEmbedder.java:90)
>> > at org.apache.pdfbox.pdmodel.font.TrueTypeEmbedder.<init>(TrueTypeEmbedder.java:72)
>> > at org.apache.pdfbox.pdmodel.font.PDTrueTypeFontEmbedder.<init>(PDTrueTypeFontEmbedder.java:56)
>> > at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.<init>(PDTrueTypeFont.java:184)
>> > at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.loadTTF(PDTrueTypeFont.java:81)
>> > at com.company.Main.main(Main.java:20)
>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > at java.lang.reflect.Method.invoke(Method.java:483)
>> > at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
>> > Caused by: java.lang.ClassNotFoundException: org.apache.fontbox.ttf.TTFParser
>> > at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
>> > at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>> > at java.security.AccessController.doPrivileged(Native Method)
>> > at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>> > at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>> > at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> > ... 11 more
>> >
>> >
>> > When I use the fontbox.jar I got from the Ant build above, the error is
>> > exactly the same as when using the jars I built with Ant. Also, note
>> > that it seems to work fine when I remove the Croatian characters from the
>> > string and add e.g. German ones (stuff like ü, ä, ß...).
>> >
>> > Thanks,
>> > Ivan
>> >
>> > On Wed, Mar 4, 2015 at 7:07 PM John Hewson <john@jahewson.com> wrote:
>> >
>> >> Hi,
>> >>
>> >>> On 28 Feb 2015, at 02:52, Ivan Klaric <iklaric@gmail.com> wrote:
>> >>>
>> >>> Hello good PDFBox people,
>> >>>
>> >>> I am working on a pet project with PDFBox and I encountered what
>> >>> seems to be an issue with UTF8 chars. If you take the following
>> >>> standard example:
>> >>>
>> >>>   public static void main(String[] args) {
>> >>>       try {
>> >>>           PDDocument document = new PDDocument();
>> >>>           PDPage page = new PDPage();
>> >>>           document.addPage( page );
>> >>>           PDFont font = PDTrueTypeFont.loadTTF(document, new File("res/Roboto-Regular.ttf"));
>> >>>           PDPageContentStream contentStream = null;
>> >>>           contentStream = new PDPageContentStream(document, page);
>> >>>           contentStream.beginText();
>> >>>           contentStream.setFont( font, 12 );
>> >>>           contentStream.moveTextPositionByAmount( 100, 700 );
>> >>>           contentStream.drawString( "Hello World čćžšđČĆŽŠĐ" );
>> >>>           contentStream.endText();
>> >>>           contentStream.close();
>> >>>           document.save( "/tmp/HelloWorld.pdf");
>> >>>           document.close();
>> >>>
>> >>>       } catch (IOException e) {
>> >>>           e.printStackTrace();
>> >>>       }
>> >>>   }
>> >>>
>> >>> (those weird characters in the drawString method are some pretty
>> >>> common Croatian letters). This is what I get:
>> >>> java.io.IOException: Error: Could not find referenced cmap stream Identity-H
>> >>> at org.apache.fontbox.cmap.CMapParser.getExternalCMap(CMapParser.java:418)
>> >>> at org.apache.fontbox.cmap.CMapParser.parsePredefined(CMapParser.java:84)
>> >>> at org.apache.pdfbox.pdmodel.font.CMapManager.getPredefinedCMap(CMapManager.java:54)
>> >>> at org.apache.pdfbox.pdmodel.font.PDType0Font.readEncoding(PDType0Font.java:159)
>> >>> at org.apache.pdfbox.pdmodel.font.PDType0Font.<init>(PDType0Font.java:119)
>> >>> at org.apache.pdfbox.pdmodel.font.PDType0Font.load(PDType0Font.java:59)
>> >>> at com.company.Main.main(Main.java:20)
>> >>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> >>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >>> at java.lang.reflect.Method.invoke(Method.java:483)
>> >>> at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
>> >>
>> >> There’s something wrong with how you built PDFBox, as this error means
>> >> it can’t find resources which we ship in the jar file. Try doing a
>> >> "mvn clean install" or using a snapshot jar instead. (Did you build
>> >> using an IDE or Ant perhaps?)
>> >>
>> >>> Am I doing something wrong? I took the Roboto-Regular font here:
>> >>> http://www.fontsquirrel.com/fonts/roboto
>> >>>
>> >>> If I remove the weird Croatian characters, the error remains the same.
>> >>> However, if I use the PDTrueTypeFont.loadTTF() (which seems to be
>> >>> deprecated) the same thing works without the Croatian characters. If I
>> >>> put the Croatian characters back in (and use PDTrueTypeFont), I get
>> >>>
>> >>> Exception in thread "main" java.lang.IllegalArgumentException: U+010D is not available in this font's Encoding
>> >>> at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.encode(PDTrueTypeFont.java:261)
>> >>> at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:268)
>> >>> at org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:316)
>> >>> at org.apache.pdfbox.pdmodel.PDPageContentStream.drawString(PDPageContentStream.java:282)
>> >>> at com.company.Main.main(Main.java:25)
>> >>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> >>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >>> at java.lang.reflect.Method.invoke(Method.java:483)
>> >>> at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
>> >>
>> >> PDTrueTypeFont.loadTTF is deprecated and only supports ANSI. For full
>> >> Unicode support, use PDType0Font.load, as explained in the @deprecated
>> >> JavaDoc tag.
>> >>
>> >>> I manually looked into the font file and it seems to contain the
>> >>> U+010D character. What am I doing wrong here?
>> >>>
>> >>> Thanks,
>> >>> Ivan
>> >>
>> >> — John
>> >>
>> >>
>>
>>
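
A note on two points from the thread: John's remark that PDTrueTypeFont.loadTTF only supports ANSI, and Ivan's observation that German characters worked while Croatian ones did not. The following is a minimal sketch and not PDFBox code. It assumes that Java's windows-1252 charset is a reasonable stand-in for the ANSI encoding John mentions (the two may differ in a few code points), and the class name AnsiCheck and method unencodable are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.ArrayList;
import java.util.List;

public class AnsiCheck {

    // Returns a "U+XXXX" label for every char in text that windows-1252
    // cannot encode. ASSUMPTION: windows-1252 approximates the ANSI
    // encoding used by the deprecated loader; this is not a PDFBox API.
    static List<String> unencodable(String text) {
        CharsetEncoder encoder = Charset.forName("windows-1252").newEncoder();
        List<String> missing = new ArrayList<>();
        for (char c : text.toCharArray()) {
            if (!encoder.canEncode(c)) {
                missing.add(String.format("U+%04X", (int) c));
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Unicode escapes avoid any dependence on the source file encoding.
        // "Hello World" followed by the five lowercase Croatian letters.
        System.out.println(unencodable("Hello World \u010d\u0107\u017e\u0161\u0111"));
        // prints [U+010D, U+0107, U+0111]

        // The German characters u-umlaut, a-umlaut and sharp s.
        System.out.println(unencodable("\u00fc\u00e4\u00df"));
        // prints []
    }
}
```

Under that assumption, ü, ä and ß all fit in the single-byte encoding, while č (U+010D), the character named in the IllegalArgumentException, does not. ž and š happen to exist in windows-1252, so only some of the Croatian letters fall outside it, which is consistent with the exception being raised for the first such character in the string.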
