pdfbox-users mailing list archives

From John Hewson <j...@jahewson.com>
Subject Re: Supporting multiple languages, including CJK
Date Tue, 18 Oct 2016 07:08:50 GMT

> On 12 Oct 2016, at 05:24, Daniel King <dking@halogensoftware.com> wrote:
> Hi,
> I'm attempting to write text to a PDF in situations where I need to support multiple
> languages on a single PDF. This may include regular Latin characters as well as CJK characters.
> I've made many attempts to do this by loading the character sets from the OS, without
> much success. The farthest I have gotten is supporting Latin characters, some Russian, and
> I believe the Vietnamese characters found in the embedded fonts example here:
> https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/pdmodel/EmbeddedFonts.java?view=markup
> I'm taking a similar approach to the example, but I believe I'm using the FileSystemFontProvider
> provided by the FontMappers class, by doing something such as:
> TrueTypeFont ttf = FontMappers.instance().getTrueTypeFont("Arial", null).getFont();
> PDFont font = PDType0Font.load(signatureDocument, ttf.getOriginalData());

Don’t load fonts like this. Follow the approach from the EmbeddedFonts example and load
them from the filesystem.

> As I mentioned, I seem to be able to support the text in the EmbeddedFonts example but
> can't seem to determine how I can also support CJK. I’m currently using 2.0.2 of PDFBox
> but could potentially upgrade to 2.0.3 if that would help at all.

If you have a font which supports CJK then PDFBox should be able to use it. I recommend “Arial
Unicode MS” as a good starting point, as it provides many more Unicode characters than plain
“Arial”. Google’s Noto fonts also provide a great selection of characters.
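
When deciding whether a candidate font covers everything a document needs, it can help to first list which Unicode scripts the text actually contains. The following is only a rough sketch using the standard Java library; it is not part of PDFBox, and the class name `ScriptSurvey` and the method `scriptsIn` are hypothetical names chosen for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

public class ScriptSurvey {

    // Returns the Unicode scripts used in the text. Characters shared across
    // scripts (spaces, digits, most punctuation, combining marks) are skipped,
    // as are unassigned code points.
    static Set<Character.UnicodeScript> scriptsIn(String text) {
        Set<Character.UnicodeScript> scripts =
                EnumSet.noneOf(Character.UnicodeScript.class);
        // Iterate by code point so supplementary characters are handled whole.
        text.codePoints().forEach(cp -> {
            Character.UnicodeScript script = Character.UnicodeScript.of(cp);
            if (script != Character.UnicodeScript.COMMON
                    && script != Character.UnicodeScript.INHERITED
                    && script != Character.UnicodeScript.UNKNOWN) {
                scripts.add(script);
            }
        });
        return scripts;
    }

    public static void main(String[] args) {
        // Sample text mixing Latin, Chinese and Russian (written as escapes).
        String sample = "Hello \u4f60\u597d \u041f\u0440\u0438\u0432\u0435\u0442";
        System.out.println(scriptsIn(sample));
    }
}
```

CJK text typically shows up as `HAN`, `HIRAGANA`, `KATAKANA` or `HANGUL`. If any of those appear, the font loaded from the filesystem needs glyphs for them, which is where a broad-coverage font such as those mentioned above comes in.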

— John

> Thanks for the help,
> Dan
