pdfbox-users mailing list archives

From Andreas Lehmkuehler <andr...@lehmi.de>
Subject Re: Supporting multiple languages, including CJK
Date Tue, 18 Oct 2016 18:20:52 GMT
On 18.10.2016 at 15:32, Daniel King wrote:
> I'm curious why you shouldn't load fonts that are scanned in by PDFBox
> using org.apache.fontbox.util.autodetect.FontDirFinder and should
> instead reference a hard-coded system directory?
Because you don't know what you will get when you ask the FontMapper for
"Arial", especially if you run your code in different environments or on
different operating systems.

You may get a simple Arial font with a limited charset, or you may get "Arial 
Unicode MS", which has wide support for non-Latin charsets, or you may get any 
Arial-like font.

IMHO there are too many "may"s, especially if you are looking for a CJK-capable font.

As John already said, the best approach is to choose the font yourself, so you 
can be sure you get what you are looking for.
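
To illustrate choosing the font yourself, here is a minimal sketch that uses only the JDK. It walks an explicit, ordered list of font files and returns the first one that actually exists, rather than asking a mapper for a name. The class name, the method name and the candidate paths are all hypothetical and need to be adapted. With PDFBox 2.0.x the chosen file would then be passed to PDType0Font.load(document, file), as the EmbeddedFonts example does.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of PDFBox.
class ExplicitFontChooser {

    // Returns the first candidate that exists as a readable regular file,
    // or an empty Optional if none of them does.
    static Optional<Path> firstExisting(List<Path> candidates) {
        return candidates.stream()
                .filter(p -> Files.isRegularFile(p) && Files.isReadable(p))
                .findFirst();
    }

    public static void main(String[] args) {
        // Assumed locations, in order of preference: a font shipped with the
        // application first, then a system copy of "Arial Unicode MS".
        List<Path> candidates = Arrays.asList(
                Paths.get("fonts", "MyCjkFont.ttf"),
                Paths.get("C:/Windows/Fonts/ARIALUNI.TTF"));

        Optional<Path> chosen = firstExisting(candidates);
        if (chosen.isPresent()) {
            // With PDFBox on the classpath, the next step would be:
            // PDFont font = PDType0Font.load(document, chosen.get().toFile());
            System.out.println("Using font file: " + chosen.get());
        } else {
            System.out.println("No candidate font file found; fail early instead of falling back.");
        }
    }
}
```

Failing early when no candidate exists is deliberate: a silent fallback to whatever "Arial" happens to be installed is exactly the uncertainty described above.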

BR
Andreas

>
> -----Original Message-----
> From: John Hewson [mailto:john@jahewson.com]
> Sent: Tuesday, October 18, 2016 3:09 AM
> To: users@pdfbox.apache.org
> Subject: Re: Supporting multiple languages, including CJK
>
>
>> On 12 Oct 2016, at 05:24, Daniel King <dking@halogensoftware.com> wrote:
>>
>> Hi,
>>
>> I'm attempting to write text to a PDF in situations where I need to
>> support multiple languages in a single PDF. This may include regular
>> Latin characters as well as CJK characters. I've made many attempts
>> to do this and to have it load the character sets from the OS, without
>> much success. The farthest I have gotten is supporting Latin characters,
>> some Russian and, I believe, Vietnamese characters found in the
>> embedded fonts example here:
>> https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/pdmodel/EmbeddedFonts.java?view=markup
>>
>> I'm taking an approach similar to the example, but I believe I'm using
>> the FileSystemFontProvider provided by the FontMappers class by doing
>> something such as
>>
>> TrueTypeFont ttf = FontMappers.instance().getTrueTypeFont("Arial", null).getFont();
>> PDFont font = PDType0Font.load(signatureDocument, ttf.getOriginalData());
>
> Don’t load fonts like this. Follow the approach from the EmbeddedFonts
> example and load them from the filesystem.
>
>> As I mentioned, I seem to be able to support the text in the
>> EmbeddedFonts example but can't seem to determine how I can also
>> support CJK. I’m currently using 2.0.2 of PDFBox but could potentially
>> upgrade to 2.0.3 if that would help at all.
>
> If you have a font which supports CJK then PDFBox should be able to use
> it. I recommend “Arial Unicode MS” as a good starting point, as it
> provides many more Unicode characters than plain “Arial”. Google’s Noto
> fonts also provide a great selection of characters.
>
> — John
>
>> Thanks for the help,
>> Dan
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> For additional commands, e-mail: users-help@pdfbox.apache.org
>



