pdfbox-dev mailing list archives

From Andreas Lehmkühler (JIRA) <j...@apache.org>
Subject [jira] [Resolved] (PDFBOX-1770) ExtractText gets all "?" when pdf 's font is instance of PDType1Font
Date Sat, 04 Jan 2014 17:40:52 GMT

     [ https://issues.apache.org/jira/browse/PDFBOX-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler resolved PDFBOX-1770.
----------------------------------------

       Resolution: Fixed
    Fix Version/s: 2.0.0
                   1.8.4
         Assignee: Andreas Lehmkühler

I fixed the issue in revisions 1555382 (trunk) and 1555384 (1.8 branch). Both the text
extraction and the rendering now work well.

It doesn't make sense to use the embedded Type1 font to encode the string if the parent font
has no encoding. I simply removed the delegating call.
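
For readers who want to picture the failure mode, here is a rough toy model in Java. It is not
the PDFBox code: the classes ParentFont and EmbeddedFont, the delegateWhenNoEncoding flag and
the parent-level fallback table are all hypothetical names invented for this sketch, and the
message above does not say what the parent font falls back to once the delegation is gone. The
sketch only shows why delegating to an embedded font that cannot map the codes yields "?" for
every character.

```java
import java.util.Map;

public class DelegationSketch {

    /** Hypothetical embedded font program: it only knows codes in its own table. */
    static final class EmbeddedFont {
        private final Map<Integer, String> builtIn;

        EmbeddedFont(Map<Integer, String> builtIn) {
            this.builtIn = builtIn;
        }

        String encode(int code) {
            // Unknown codes come back as "?".
            return builtIn.getOrDefault(code, "?");
        }
    }

    /** Hypothetical parent font; its encoding may be null, as in the report. */
    static final class ParentFont {
        private final Map<Integer, String> encoding;  // null models "fontEncoding is null"
        private final Map<Integer, String> fallback;  // assumed parent-level mapping
        private final EmbeddedFont embedded;
        private final boolean delegateWhenNoEncoding; // true models the old behaviour

        ParentFont(Map<Integer, String> encoding, Map<Integer, String> fallback,
                   EmbeddedFont embedded, boolean delegateWhenNoEncoding) {
            this.encoding = encoding;
            this.fallback = fallback;
            this.embedded = embedded;
            this.delegateWhenNoEncoding = delegateWhenNoEncoding;
        }

        String encode(int code) {
            if (encoding != null) {
                return encoding.getOrDefault(code, "?");
            }
            if (delegateWhenNoEncoding) {
                // Old path in this toy model: hand the code to the embedded font.
                return embedded.encode(code);
            }
            // New path in this toy model: the parent resolves the code itself.
            return fallback.getOrDefault(code, "?");
        }
    }

    /** Concatenates the decoded text for a sequence of character codes. */
    static String extract(ParentFont font, int[] codes) {
        StringBuilder sb = new StringBuilder();
        for (int code : codes) {
            sb.append(font.encode(code));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<Integer, String> parentTable = Map.of(1, "H", 2, "i");
        EmbeddedFont embedded = new EmbeddedFont(Map.of()); // knows none of the codes
        int[] codes = {1, 2};

        ParentFont before = new ParentFont(null, parentTable, embedded, true);
        ParentFont after = new ParentFont(null, parentTable, embedded, false);

        System.out.println(extract(before, codes)); // prints "??"
        System.out.println(extract(after, codes));  // prints "Hi"
    }
}
```

In this model the only difference between the two runs is whether the null-encoding case is
handed to the embedded font, which is the kind of delegating call the fix removed.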

Thanks for the report!

> ExtractText gets all "?" when pdf 's font is instance of PDType1Font
> --------------------------------------------------------------------
>
>                 Key: PDFBOX-1770
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1770
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.8.2
>            Reporter: Sean.Sun
>            Assignee: Andreas Lehmkühler
>             Fix For: 1.8.4, 2.0.0
>
>         Attachments: The Importance of Symmetry.pdf, The Importance of Symmetry.pdf.d2t
>
>
> ExtractText gets all "?" when font is instanceof PDType1Font and subtype is type1CFont
> and fontEncoding is null.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
