pdfbox-users mailing list archives

From John Hewson <j...@jahewson.com>
Subject Re: PDFont.getStringWidth() throws IllegalArgumentException
Date Fri, 19 Jun 2015 18:47:42 GMT

> On 19 Jun 2015, at 02:36, Torgeir Veimo <torgeir.veimo@gmail.com> wrote:
> 
> Ok, I guess I need to prefilter the strings. However I can't seem to
> find any reference to what characters are invalid in all pdf fonts.

Line-breaking characters won’t work with showText(), but other than that any
character is potentially valid; it depends on whether the TTF you’re using
contains a glyph for it.
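
One way to keep line-breaking characters out of showText() is to split the
string first and draw each piece separately. The following is only a sketch:
LineSplitter and splitLines are hypothetical names, not part of PDFBox, and it
relies on the Java 8+ regex linebreak matcher \R, which covers U+2028 and
U+2029 as well as \n, \r and \r\n.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of PDFBox): splits text into lines before drawing. */
final class LineSplitter {

    private LineSplitter() {
    }

    /** Splits on any Unicode line break so that no piece contains one. */
    static List<String> splitLines(String text) {
        // \R matches \r\n as a single break, plus \n, \r, U+0085, U+2028, U+2029.
        // The -1 limit keeps trailing empty lines instead of silently dropping them.
        String[] parts = text.split("\\R", -1);
        return new ArrayList<>(Arrays.asList(parts));
    }
}
```

Each returned line could then be passed to showText() on its own, moving the
text position between calls (for example with newLineAtOffset() in 2.0).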

> Is there a way to query for what glyphs are included in the font?

You could call the PDFont#encode(text) for each character and catch the
IllegalArgumentException. I know that’s not ideal.
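
A rough sketch of that per-character probe follows. GlyphProbe and its Encoder
interface are hypothetical names, not PDFBox API; Encoder merely stands in for
PDFont#encode(String) so the example compiles without PDFBox, and it assumes a
missing glyph is reported as an IllegalArgumentException, as in this thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper (not part of PDFBox): finds code points a font cannot encode. */
final class GlyphProbe {

    private GlyphProbe() {
    }

    /** Stand-in for PDFont#encode(String); assumed to throw IllegalArgumentException on a missing glyph. */
    interface Encoder {
        byte[] encode(String text) throws IOException;
    }

    /** Returns the distinct code points the encoder rejects, in order of first appearance. */
    static Set<Integer> unsupported(String text, Encoder encoder) {
        Set<Integer> missing = new LinkedHashSet<>();
        // Walk by code point so a surrogate pair is probed as one character.
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            i += Character.charCount(codePoint);
            try {
                encoder.encode(new String(Character.toChars(codePoint)));
            } catch (IllegalArgumentException e) {
                // The encoder has no glyph for this code point.
                missing.add(codePoint);
            } catch (IOException e) {
                // Any other I/O failure is not a missing glyph; surface it unchanged.
                throw new UncheckedIOException(e);
            }
        }
        return missing;
    }
}
```

With PDFBox on the classpath, the call might look like
GlyphProbe.unsupported(text, font::encode), assuming PDFont#encode(String)
returns byte[] and declares IOException as it does in 2.0. The result can be
used to strip or replace characters before calling getStringWidth() or showText().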

The showText() API is new and we’re open to making changes or improvements,
in particular a lot of users have been confused by the IllegalArgumentException.
The problem is that we don’t really want to generate PDFs with missing glyphs,
so we’d rather catch that and throw an exception. Perhaps a checked exception
would be clearer, e.g. MissingGlyphException?

I was thinking we could still throw an IllegalArgumentException for cases where
there’s a newline in the input, because that’s not supported by PDF.

Thoughts?

— John

> On 19 June 2015 at 15:07, John Hewson <john@jahewson.com> wrote:
>>> On 16 Jun 2015, at 21:17, Torgeir Veimo <torgeir.veimo@gmail.com> wrote:
>>> 
>>> Am trying out PDFBox 2.0.0-SNAPSHOT to get PDF embedding working, but
>>> am getting exceptions with normal font routines.
>>> 
>>> On the string
>>> "fagområder som: • skatterådgivning • regnskapsrådgivning •
>>> generasjonsskifte • selskapsstiftelse og omdanning • verdivurdering •
>>> opplegg av bedriftens interne organisasjon • undervisning •"
>>> 
>>> I'm getting java.lang.IllegalArgumentException: No glyph for U+2028 in
>>> font Lato-MediumItalic.
>> 
>> U+2028 is a line separator, which isn't supported by showText().
>> 
>>> Wouldn't it be more appropriate to assume missing glyphs have zero
>>> width? Is there a way to prevent this from happening?
>> 
>> Missing glyphs are going to appear in a PDF as an outlined rectangle, which
>> is almost certainly not what you want. We throw an error rather than
>> generating a bad PDF.
>> 
>> You need to handle the line breaking yourself and/or use a font which
>> provides a glyph for U+2028.
>> 
>> -- John
>> 
>>> --
>>> -Tor
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
>>> For additional commands, e-mail: users-help@pdfbox.apache.org
>>> 
> 
> 
> 
> -- 
> -Tor