pdfbox-dev mailing list archives

From "Bastian Preindl (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PDFBOX-1001) TextPosition.getHeight() returns erroneous value for some PDFs
Date Tue, 16 Aug 2011 10:20:27 GMT

     [ https://issues.apache.org/jira/browse/PDFBOX-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bastian Preindl updated PDFBOX-1001:

    Attachment: dreher.pdf

Andreas Lehmkühler asked me to upload an example PDF for the bug described. You will find it attached.

> TextPosition.getHeight() returns erroneous value for some PDFs
> --------------------------------------------------------------
>                 Key: PDFBOX-1001
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1001
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.4.0, 1.5.0
>         Environment: Solaris, WinXP
>            Reporter: Emil Wacker
>         Attachments: dreher.pdf
> For a PDF that worked fine under 1.2.1 the height value returned is negative and the wrong value (i.e. using Math.abs() won't fix it). Other PDFs work fine.
> PDF Debug shows "Creator:Crystal Reports" and "Producer:PDF-XChange (XCPRO30.DLL v3.30.0064) (Windows 2k)"
> And when examining the 'Stream' items, the text is not what displays.
> Any suggestions on what to look for so that I can do differential analysis against other PDFs to see what they do/not have in common with this one?
> (It's client data so I can't post the PDF.)
> It's stopping us from moving off 1.2.1 (and later versions fix another issue we have of seeing question marks instead of the actual characters).

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

