pdfbox-users mailing list archives

From John Hewson <j...@jahewson.com>
Subject Re: quotedbl causes NullPointerException
Date Mon, 24 Aug 2015 18:59:17 GMT
Hi Juergen,

Thanks for letting us know about this. The NullPointerException certainly sounds like a PDFBox bug.
Please open an issue on JIRA (https://issues.apache.org/jira/browse/PDFBOX/) and upload the problem PDF (via More > Attach Files).

Thanks,

— John

> On 24 Aug 2015, at 11:11, Jürgen Uhl <juergenuhl1@gmx.de> wrote:
> 
> I have a PDF document that uses (among others) the font CourierNewPS-BoldMT, and text in this font contains a double quote.
> 
> Calling PDFont.encode on this text results in a NullPointerException, for the following reason:
> The font encoding is built using the PDF /Differences array, which overwrites the original "quotedbl" at index 34 with an "A". The entries for quotedblbase/left/right are left unchanged. As a result, the inverted encoding does not contain "quotedbl" as a key.
> Within encode, the character code 34 gets assigned the name "quotedbl", which is then not found in the inverse encoding (PDTrueTypeFont.encode -> int code = inverted.get(name)).
> Right before the line that causes the NullPointerException, there is a check whether ttf.hasGlyph("quotedbl") (which in this case is false) and, if not, whether ttf.hasGlyph("uni0022") (which in this case is true). However, the result of this check has no effect on the code that follows, which then crashes because inverted.get("quotedbl") returns null and that null is assigned to an int.
> I believe this is a bug in PDFBox, but I have no idea whether the handling within encode should be changed (maybe using the "else" part in case ttf.hasGlyph("quotedbl") is false), whether code 34 should be assigned to quotedblbase in the first place, or even something else.
> In any case, I'd of course be eager to learn about ways to circumvent this situation as a PDFBox user.
> Juergen
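
The failure mode described in the quoted message can be reproduced outside PDFBox with plain Java collections. The following is only a minimal sketch of the mechanism, not PDFBox's actual code: the class InvertedEncodingSketch, the helpers invert and codeFor, the -1 sentinel, and the code assigned to quotedblbase are all hypothetical and chosen for illustration. It shows how a /Differences-style override removes "quotedbl" from the inverted map, why assigning the result of Map.get to an int then throws, and what a null-safe lookup might look like.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: none of these names are PDFBox API.
class InvertedEncodingSketch {

    // Builds a glyph-name -> code map from a code -> glyph-name map.
    static Map<String, Integer> invert(Map<Integer, String> codeToName) {
        Map<String, Integer> inverted = new HashMap<>();
        for (Map.Entry<Integer, String> e : codeToName.entrySet()) {
            inverted.put(e.getValue(), e.getKey());
        }
        return inverted;
    }

    // Null-safe lookup: returns -1 (an assumed sentinel) instead of
    // unboxing a null Integer when the glyph name is not mapped.
    static int codeFor(Map<String, Integer> inverted, String glyphName) {
        Integer code = inverted.get(glyphName);
        return code != null ? code : -1;
    }

    public static void main(String[] args) {
        Map<Integer, String> codeToName = new HashMap<>();
        codeToName.put(34, "quotedbl");       // entry from the base encoding
        codeToName.put(132, "quotedblbase");  // illustrative code, left unchanged
        codeToName.put(34, "A");              // /Differences-style override of code 34

        Map<String, Integer> inverted = invert(codeToName);
        System.out.println(inverted.containsKey("quotedbl")); // false

        try {
            // Same shape as the line quoted in the report:
            // Map.get returns null, and unboxing null to int throws.
            int code = inverted.get("quotedbl");
            System.out.println(code);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException while unboxing null");
        }

        // The guarded variant reports the missing name instead of crashing.
        System.out.println(codeFor(inverted, "quotedbl")); // -1
    }
}
```

Whether a missing name should produce a sentinel, an exception with a clearer message, or a fallback to another glyph name such as "uni0022" is a design decision for the library. The sketch only shows that the lookup result needs a null check before it is treated as an int.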

