pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: Choosing a font for non-ASCII characters
Date Sun, 03 Mar 2019 14:07:51 GMT
Am 03.03.2019 um 14:46 schrieb Christopher Schultz:
> Tilman,
>
> On 3/2/19 10:00, Tilman Hausherr wrote:
>> Am 02.03.2019 um 15:54 schrieb Christopher Schultz:
>>> Is there a good way to probe text to determine whether or not an
>>> alternate font will be necessary and only load/bundle it then?
>> From the new EmbeddedMultipleFonts.java example (in the source
>> code download):
>>
>>
>> boolean isWinAnsiEncoding(int unicode) {
>>     String name = GlyphList.getAdobeGlyphList().codePointToName(unicode);
>>     if (".notdef".equals(name)) {
>>         return false;
>>     }
>>     return WinAnsiEncoding.INSTANCE.contains(name);
>> }
>>
>>
>> When that one returns true, you can use the built-in fonts.
> Okay, I see that. Is there any reason not to do this?
>
>      boolean isWinAnsiEncoding(int unicode)
>      {
>          return WinAnsiEncoding.INSTANCE.contains(unicode);
>      }
>
> ?

I haven't tried it, because the numbers that contains(int) expects are not
Unicode code points. Some match, some don't. These "codes" are the character
codes used in the PDF.
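
The name-based check quoted above works on one code point at a time, so probing a whole string means walking it by code point rather than by char. The following is only a sketch of that loop: FontProbe and needsAlternateFont are made-up names, and the ASCII-only predicate is a stand-in (an assumption) for the isWinAnsiEncoding method from the example.

```java
import java.util.function.IntPredicate;

class FontProbe {
    // Returns true if at least one code point in the text fails the check,
    // i.e. the built-in fonts would not be enough for this string.
    static boolean needsAlternateFont(String text, IntPredicate supported) {
        // codePoints() iterates by Unicode code point, so a surrogate pair
        // reaches the predicate as a single value.
        return !text.codePoints().allMatch(supported);
    }

    public static void main(String[] args) {
        // Stand-in predicate (assumption): plain ASCII only. In real use this
        // would be the isWinAnsiEncoding(int) method quoted in this thread.
        IntPredicate asciiOnly = cp -> cp < 0x80;
        System.out.println(needsAlternateFont("Hello", asciiOnly));            // false
        System.out.println(needsAlternateFont("Gr\u00fc\u00dfe", asciiOnly));  // true
    }
}
```

With a check like this, the alternate font only has to be loaded when the method returns true for some text on the page.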

>
> Is there nothing like PDFont.isSupportedCodePoint(unicode) available?


No. I agree that it is sort of annoying the way it is done now, but for 
some reason it hasn't been improved. Maybe because each of the font 
types has a different approach to find out whether it works... until 
then, catching IllegalArgumentException is the way to go.

Did you get your application to work, or should PDFBox be redesigned first?


> I didn't see anything. It looks more like the standard way to check is
> to:
>
> try {
>    page.showText(text);
> } catch (IllegalArgumentException iae) {
>    page.setFont(alternateFont);
>    page.showText(text);
> }

You can also call the font's encode method on the text instead of
showText, as done in the example.
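
A rough sketch of that "try to encode, then fall back" idea, written without PDFBox so it runs on its own. TextEncoder is a hypothetical stand-in for a font: the only assumption carried over from this thread is that encoding throws IllegalArgumentException when a character cannot be represented. The charset-based encoders are toys for the demo, and any checked exceptions the real method may declare are left out.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class FontFallback {
    // Hypothetical stand-in for a font. encode() is assumed to throw
    // IllegalArgumentException when some character has no glyph.
    interface TextEncoder {
        byte[] encode(String text);
    }

    // Returns the index of the first candidate that can encode the text,
    // or -1 if none of them can.
    static int firstUsable(String text, List<TextEncoder> candidates) {
        for (int i = 0; i < candidates.size(); i++) {
            try {
                candidates.get(i).encode(text);
                return i;
            } catch (IllegalArgumentException e) {
                // This candidate cannot represent the text; try the next one.
            }
        }
        return -1;
    }

    // Toy encoder for the demo: rejects text the charset cannot represent.
    static TextEncoder strict(Charset cs) {
        return text -> {
            if (!cs.newEncoder().canEncode(text)) {
                throw new IllegalArgumentException("cannot encode: " + text);
            }
            return text.getBytes(cs);
        };
    }

    public static void main(String[] args) {
        List<TextEncoder> fonts = Arrays.asList(
                strict(StandardCharsets.US_ASCII),  // plays the built-in font
                strict(StandardCharsets.UTF_8));    // plays the alternate font
        System.out.println(firstUsable("plain", fonts));      // 0
        System.out.println(firstUsable("na\u00efve", fonts)); // 1
    }
}
```

Probing with encode first means nothing has been written to the content stream when the exception is thrown, so the font can be chosen before any text is shown.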


>
> If that's SOP, then maybe there is no real reason to bother checking
> whether the String will work in the first place... just try it and try
> again if the operation fails?
>
> Catching IllegalArgumentException seems ugly, though. Maybe PDFBox
> could subclass IllegalArgumentException with something more narrow
> like IllegalCodePointException and throw that instead? It would be
> backward-compatible and also one could determine the root cause
> without parsing the exception message to see what the problem was.

IIRC it's always that cause if it throws.
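
For reference, the subclass proposed in the quoted text might look roughly like the sketch below. IllegalCodePointException is the hypothetical name from the proposal and is not something this thread says exists in PDFBox. The encode method here is a toy that only accepts ASCII; it exists to show that a handler written for IllegalArgumentException keeps working, while new code can read the offending code point directly.

```java
// Hypothetical exception type from the proposal in this thread.
class IllegalCodePointException extends IllegalArgumentException {
    private final int codePoint;

    IllegalCodePointException(int codePoint) {
        super(String.format("No glyph for U+%04X", codePoint));
        this.codePoint = codePoint;
    }

    int getCodePoint() {
        return codePoint;
    }
}

class CodePointDemo {
    // Toy encoder (assumption): anything outside ASCII has "no glyph".
    static void encode(String text) {
        text.codePoints()
            .filter(cp -> cp >= 0x80)
            .findFirst()
            .ifPresent(cp -> { throw new IllegalCodePointException(cp); });
    }

    static String describe(String text) {
        try {
            encode(text);
            return "ok";
        } catch (IllegalArgumentException e) {
            // An existing catch block like this one still sees the subclass.
            if (e instanceof IllegalCodePointException) {
                int cp = ((IllegalCodePointException) e).getCodePoint();
                return String.format("missing U+%04X", cp);
            }
            return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("abc"));       // ok
        System.out.println(describe("caf\u00e9")); // missing U+00E9
    }
}
```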

Tilman


>
> I'm happy to provide a patch.
>
> -chris
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> For additional commands, e-mail: users-help@pdfbox.apache.org
>