pdfbox-users mailing list archives

From Christopher Schultz <ch...@christopherschultz.net>
Subject Re: Choosing a font for non-ASCII characters
Date Wed, 20 Mar 2019 15:00:04 GMT


On 3/20/19 03:55, Tilman Hausherr wrote:
> On 19.03.2019 at 22:08, Christopher Schultz wrote: Tilman,
> On 3/19/19 16:23, Tilman Hausherr wrote:
>>>> On 19.03.2019 at 19:45, Christopher Schultz wrote: Tilman,
>>>> So I'm starting to look toward making my code better now that
>>>> it's actually working. Right now, my code looks like this:
>>>>     if(!isAnsiEncoding(strippedText)) {
>>>>         font = getFullUnicodeFont();
>>>>     }
>>>> Where one font simply replaces the other for strings that
>>>> aren't available in the built-in font(s).
>>>> I'd like to support emoji and stuff like that. I can find a
>>>> font (or fonts) for that, but I think the only way I can do
>>>> that with the existing API is something like this:
>>>>     Font[] fonts = new Font[] { builtIn, arialUnicode, emoji };
>>>>     for(Font font : fonts) {
>>>>         try {
>>>>             page.setFont(font);
>>>>             page.showText(text);
>>>>             break; // stop at the first font that can encode the text
>>>>         } catch (IllegalArgumentException iae) {
>>>>             // Try the next font
>>>>         }
>>>>     }
>>>> That will "work" but it will not work if, for example, I need
>>>> to print text that includes both Chinese characters (from
>>>> arialUnicode font) and also emoji (from the hypothetical
>>>> "emoji" font).
>>>> Is there any way to tell PDFBox to "pick the right font (from
>>>> some list) for each character"?
>>>>> No, that is why I created the EmbeddedMultipleFonts.java
>>>>> example which I mentioned earlier in the thread. That one
>>>>> can switch within strings.
> Right, it basically does the same thing as I have above, but for a 
> bunch of increasingly-widening substrings, and it uses exceptions
> for flow control. Yuck.
> I'd have to look more into what PDFont.encode does, but I'm
> guessing that it wouldn't be too hard to build methods into the
> PDFont class that look something like this:
>     /**
>      * Returns true if this PDFont can render the whole string.
>      */
>     public boolean canEncode(String s);
>
>     /**
>      * Returns the longest String that can be successfully encoded by this
>      * PDFont, beginning at the beginning of {s}. If the whole String {s}
>      * is encodable, then {s} will be returned. If only a part of {s}
>      * is encodable, then the return value of this method will be such that:
>      *
>      *   s.startsWith(getLongestEncodablePrefix(s)) == true
>      *
>      * If the first character of the string is not encodable in this PDFont,
>      * an empty string (or null?) will be returned.
>      */
>     public String getLongestEncodablePrefix(String s);
>> That would just push what you called "Yuck" further downwards, or
>> we would have to maintain code twice, one for checking whether
>> something can be encoded, and one for actually doing it. And this
>> for all the 6, maybe 7 font types.

Code reuse?
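
A rough sketch of what reuse could look like, under loud assumptions: nothing below is PDFBox API. Candidate, Run and FontRuns are hypothetical names, and the per-code-point glyph check is a stand-in for whatever a real font type would consult. The point is only that canEncode and getLongestEncodablePrefix can both sit on top of one shared scan, and that the same scan is enough to split mixed text into per-font runs without using exceptions for flow control.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical sketch, not PDFBox code: all names here are made up for illustration.
class FontRuns {

    /** Stand-in for a font: a name plus a per-code-point "has glyph" check. */
    static final class Candidate {
        final String name;
        final IntPredicate hasGlyph;

        Candidate(String name, IntPredicate hasGlyph) {
            this.name = name;
            this.hasGlyph = hasGlyph;
        }

        /** The one shared scan: number of leading chars of s this font covers. */
        int encodablePrefixLength(String s) {
            int i = 0;
            while (i < s.length()) {
                int cp = s.codePointAt(i);          // handles surrogate pairs
                if (!hasGlyph.test(cp)) {
                    break;
                }
                i += Character.charCount(cp);
            }
            return i;
        }

        /** True if the whole string is covered. */
        boolean canEncode(String s) {
            return encodablePrefixLength(s) == s.length();
        }

        /** Longest covered prefix; empty if the first character is not covered. */
        String getLongestEncodablePrefix(String s) {
            return s.substring(0, encodablePrefixLength(s));
        }
    }

    /** A stretch of text paired with the font chosen for it. */
    static final class Run {
        final Candidate font;
        final String text;

        Run(Candidate font, String text) {
            this.font = font;
            this.text = text;
        }
    }

    /**
     * Splits text into runs. At each position, the first font in the list
     * (preference order) that covers the next character takes as much of
     * the remaining text as it can.
     */
    static List<Run> split(String text, List<Candidate> fonts) {
        List<Run> runs = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            String rest = text.substring(pos);
            Run next = null;
            for (Candidate font : fonts) {
                String prefix = font.getLongestEncodablePrefix(rest);
                if (!prefix.isEmpty()) {
                    next = new Run(font, prefix);
                    break;
                }
            }
            if (next == null) {
                throw new IllegalArgumentException("No font covers U+"
                        + Integer.toHexString(text.codePointAt(pos)).toUpperCase());
            }
            runs.add(next);
            pos += next.text.length();
        }
        return runs;
    }
}
```

With something like this in place, the caller would loop over the runs and do the page.setFont / page.showText pair once per run, so text mixing Chinese characters and emoji would come out as alternating runs from different fonts. Whether the real font types can answer "is this code point covered?" cheaply is exactly the part I'd have to check.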

>> Instead of going forward with your project with the working code 
>> provided, you're arguing about design issues.

You are operating under the impression that I haven't already modified
my own code to work. I have.

I'm volunteering to help improve your product. You don't have to get
so upset when someone offers help.

-chris