pdfbox-dev mailing list archives

From "Petr Slaby (JIRA)" <j...@apache.org>
Subject [jira] [Created] (PDFBOX-2108) Type0 CFF Font with identity encoding rendered incorrectly
Date Tue, 03 Jun 2014 09:46:02 GMT
Petr Slaby created PDFBOX-2108:

             Summary: Type0 CFF Font with identity encoding rendered incorrectly
                 Key: PDFBOX-2108
                 URL: https://issues.apache.org/jira/browse/PDFBOX-2108
             Project: PDFBox
          Issue Type: Bug
          Components: FontBox, Rendering
    Affects Versions: 2.0.0
            Reporter: Petr Slaby

The attached PDF files were created with iText. Both print two lines containing the text "Hello
World!". The second line is printed using a Type0 CFF font with the Identity-H encoding. One of
the documents (font0.pdf) uses a Western-language font. The other one (font0-arabic.pdf)
uses an Arabic font that also contains Latin letters. The expected rendering can be
seen in Acrobat; the incorrect rendering produced by PDFBox via PDFToImage is attached.
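
For context, Identity-H treats every character code in a shown string as two bytes, big-endian,
and uses the numeric value of that code directly as the CID. The sketch below only illustrates
that decoding step. It is not PDFBox code and not the content of the attached patch; the class
and method names are hypothetical, and the later CID-to-glyph lookup inside the CFF font is
deliberately left out.

```java
import java.util.Arrays;

public class IdentityHSketch {

    /**
     * Decodes the bytes of a shown string under Identity-H:
     * each pair of bytes is one character code, and CID == code.
     * (Hypothetical helper for illustration, not a PDFBox API.)
     */
    static int[] decode(byte[] shown) {
        if (shown.length % 2 != 0) {
            throw new IllegalArgumentException("Identity-H codes are always two bytes long");
        }
        int[] cids = new int[shown.length / 2];
        for (int i = 0; i < cids.length; i++) {
            int high = shown[2 * i] & 0xFF;      // mask to avoid sign extension
            int low = shown[2 * i + 1] & 0xFF;
            cids[i] = (high << 8) | low;         // big-endian 16-bit value
        }
        return cids;
    }

    public static void main(String[] args) {
        // Two 2-byte codes: 0x002B and 0x0048 (arbitrary sample values).
        byte[] shown = {0x00, 0x2B, 0x00, 0x48};
        System.out.println(Arrays.toString(decode(shown)));
    }
}
```

The values produced here are CIDs, not Unicode code points or single-byte codes, so a renderer
that runs them through a simple-font encoding table will pick the wrong glyphs.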

I have experimented with the problem and tried to fix it; the proposed changes are attached in
Type0FontEncoding.patch. However, I arrived at this fix purely by experimenting, without really
reading the PDF specification. I am also getting lost in the many mappings used internally
in PDFBox, so an expert review is definitely required.

Note: I have seen several issues in which font encoding is already discussed, but I was not able
to determine whether this exact problem is already described in one of them. Sorry if this is
a duplicate.

Note: Even after fixing the encoding problem, the font size of the second line in the "Arabic"
example is still wrong, for a reason I have not identified. This is probably a separate issue,
and I have not tried to analyze it yet.

This message was sent by Atlassian JIRA
