pdfbox-users mailing list archives

From Daniel King <dk...@halogensoftware.com>
Subject RE: Supporting multiple languages, including CJK
Date Tue, 18 Oct 2016 13:32:17 GMT
I'm curious why you shouldn't load the fonts that PDFBox finds by scanning with org.apache.fontbox.util.autodetect.FontDirFinder,
and should instead reference a hard-coded system directory?
-----Original Message-----
From: John Hewson [mailto:john@jahewson.com] 
Sent: Tuesday, October 18, 2016 3:09 AM
To: users@pdfbox.apache.org
Subject: Re: Supporting multiple languages, including CJK

> On 12 Oct 2016, at 05:24, Daniel King <dking@halogensoftware.com> wrote:
> Hi,
> I'm attempting to write text to a PDF in situations where I need to
> support multiple languages in a single PDF. This may include regular
> Latin characters as well as CJK characters. I've made many attempts
> to do this and have it load the character sets from the OS, without
> much success. The farthest I have gotten is supporting Latin characters,
> some Russian, and I believe the Vietnamese characters found in the
> embedded fonts example here:
> https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/pdmodel/EmbeddedFonts.java?view=markup
> I'm taking a similar approach to the example, but I believe I'm using
> the FileSystemFontProvider provided by the FontMappers class, by doing
> something such as:
>
> TrueTypeFont ttf = FontMappers.instance().getTrueTypeFont("Arial", null).getFont();
> PDFont font = PDType0Font.load(signatureDocument, ttf.getOriginalData());

Don’t load fonts like this. Follow the approach from the EmbeddedFonts example and load
them from the filesystem.
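
A minimal sketch of that approach, assuming PDFBox 2.0.x is on the classpath. The class name CjkTextSketch, the loadFont helper and the two command-line arguments (font path, output path) are illustrative and do not come from this thread or from the EmbeddedFonts example. The font file is assumed to be a .ttf that contains glyphs for every character being written.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType0Font;

public class CjkTextSketch {

    // Hypothetical helper: load a TrueType font straight from a file
    // and embed it in the given document as a Type 0 (composite) font.
    static PDFont loadFont(PDDocument document, File fontFile) throws IOException {
        if (!fontFile.isFile()) {
            throw new FileNotFoundException("Font file not found: " + fontFile);
        }
        return PDType0Font.load(document, fontFile);
    }

    public static void main(String[] args) throws IOException {
        File fontFile = new File(args[0]);   // e.g. a path to a Unicode-capable .ttf
        String outputPath = args[1];         // where to write the PDF

        try (PDDocument document = new PDDocument()) {
            PDPage page = new PDPage();
            document.addPage(page);

            PDFont font = loadFont(document, fontFile);

            try (PDPageContentStream contents = new PDPageContentStream(document, page)) {
                contents.beginText();
                contents.setFont(font, 12);
                contents.newLineAtOffset(100, 700);
                // "Hello" followed by two CJK ideographs, written as escapes
                // so the source file stays ASCII-only.
                contents.showText("Hello \u4e16\u754c");
                contents.endText();
            }

            // Save while the document is still open.
            document.save(outputPath);
        }
    }
}
```

Run it with the font path and the output path as arguments. Which font file to pass depends on what is installed on the machine.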

> As I mentioned, I seem to be able to support the text in the EmbeddedFonts example but
> can't seem to determine how I can also support CJK. I’m currently using 2.0.2 of PDFBox
> but could potentially upgrade to 2.0.3 if that would help at all.

If you have a font which supports CJK then PDFBox should be able to use it. I recommend “Arial
Unicode MS” as a good starting point, as it provides many more Unicode characters than plain
“Arial”. Google’s Noto fonts also provide a great selection of characters.
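
Before picking a font, it can help to know which scripts a piece of text actually uses. The sketch below is not from this thread and does not use PDFBox. It relies only on the JDK's Character.UnicodeScript, and the class and method names (ScriptSurvey, scriptsIn) are made up for illustration. It reports scripts, not glyph coverage, so it only narrows down which kind of font is needed.

```java
import java.util.EnumSet;
import java.util.Set;

public class ScriptSurvey {

    // Returns the Unicode scripts used by the text, ignoring code points
    // that are shared across scripts (spaces, digits, punctuation,
    // combining marks), which the JDK reports as COMMON or INHERITED.
    public static Set<Character.UnicodeScript> scriptsIn(String text) {
        Set<Character.UnicodeScript> scripts = EnumSet.noneOf(Character.UnicodeScript.class);
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            Character.UnicodeScript script = Character.UnicodeScript.of(codePoint);
            if (script != Character.UnicodeScript.COMMON
                    && script != Character.UnicodeScript.INHERITED) {
                scripts.add(script);
            }
            // Advance by one or two chars so surrogate pairs are handled.
            i += Character.charCount(codePoint);
        }
        return scripts;
    }

    public static void main(String[] args) {
        // "Hello" plus two Han ideographs, written as escapes.
        System.out.println(scriptsIn("Hello \u4e16\u754c"));
    }
}
```

If the result includes HAN, HIRAGANA, KATAKANA or HANGUL, the text needs a font with CJK coverage.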

— John

> Thanks for the help,
> Dan
