pdfbox-users mailing list archives

From Melanie Freed <mefr...@gmail.com>
Subject Problem extracting height of Type 3 Font?
Date Mon, 08 Aug 2016 21:45:25 GMT
Hi.  I'm using pdfbox-2.0.2 and am having trouble getting the height of
extracted text from a PDF with Type 3 fonts.

I've been able to get the height for Type 1 fonts by overriding the
writeString method of the PDFTextStripper class and using the maximum
font size in points as the height:

    float height = 0f;
    for (TextPosition textPosition : textPositions)
    {
        height = Math.max(height, textPosition.getFontSizeInPt());
    }

But this doesn't work for Type 3 fonts since they don't use sizes in the
same way.  I tried to use the bounding box like this:

    // Bounding box of the font used by the first character in the string
    PDFont font = textPositions.get(0).getFont();
    BoundingBox bbox = font.getBoundingBox();
    float height = bbox.getHeight();

But the results aren't what I would expect.  For example, when I run it on
a document with a Type 1 font, I get a value of 7.0 as the font size in
points (using the first method) and the second method gives me a value of
1156.0.

Am I missing some kind of conversion from units of the bounding box to
points?  Or just approaching this problem in the wrong way?
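
To make the unit question concrete, here is a minimal sketch of the conversion I have in mind. It assumes the bounding box is expressed in glyph space, that a Type 1 font uses the conventional 1/1000 glyph-space scale, and that a Type 3 font would supply its own vertical scale through its font matrix (in PDFBox that might come from something like font.getFontMatrix().getScaleY(), which I have not verified for this case). The class GlyphHeight and the method toPoints are hypothetical names for illustration, not part of PDFBox.

```java
public class GlyphHeight {

    /**
     * Converts a height in glyph-space units to points.
     *
     * glyphSpaceHeight  height from the font bounding box (glyph space)
     * fontMatrixScaleY  vertical scale of the font matrix; assumed to be
     *                   0.001 for Type 1 fonts, font-specific for Type 3
     * fontSizeInPt      font size in points, e.g. from getFontSizeInPt()
     */
    static float toPoints(float glyphSpaceHeight,
                          float fontMatrixScaleY,
                          float fontSizeInPt) {
        return glyphSpaceHeight * fontMatrixScaleY * fontSizeInPt;
    }

    public static void main(String[] args) {
        // Values from the Type 1 example in this message:
        // bounding box height 1156.0, font size 7.0 pt.
        float type1Scale = 0.001f; // assumption: standard Type 1 glyph space
        System.out.println(toPoints(1156.0f, type1Scale, 7.0f));
    }
}
```

With the Type 1 numbers above this gives roughly 8.09 points, which is at least in the same range as the 7.0 point font size. Whether the same formula holds for Type 3 fonts depends on how the font size reported for them relates to their font matrix, which is the part I am unsure about.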

Any advice would be greatly appreciated!

Thanks in advance,
Melanie
