pdfbox-dev mailing list archives

From Andreas Lehmkühler (JIRA) <j...@apache.org>
Subject [jira] [Issue Comment Deleted] (PDFBOX-1812) Illegal characters in XML output
Date Thu, 02 Jan 2014 10:26:51 GMT

     [ https://issues.apache.org/jira/browse/PDFBOX-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler updated PDFBOX-1812:

    Comment: was deleted

(was: I'm currently away and unable to respond to your message. I will be back the 2nd of

Best regards,


        [Koninklijke Bibliotheek, National Library of the Netherlands] <http://www.kb.nl>
        Prins Willem-Alexanderhof 5 | 2595 BE Den Haag
Postbus 90407 | 2509 LK Den Haag | (070) 314 09 11 | www.kb.nl<http://www.kb.nl/>
        English version<http://www.kb.nl/red/email.html> | Disclaimer<http://www.kb.nl/red/disclaimer.html>


> Illegal characters in XML output
> --------------------------------
>                 Key: PDFBOX-1812
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1812
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Preflight
>    Affects Versions: 2.0.0
>         Environment: Bug reproduced under Win 7, Ubuntu
>            Reporter: Johan van der Knijff
>              Labels: characters, utf-8, xml
>             Fix For: 2.0.0
>         Attachments: 013814.pdf, 013814.xml, 013814_old.xml, 598659.pdf, 598659.xml,
>                      598659_old.xml, 600111.pdf, 600111.xml, 600111_old.xml, preflight-app.jar
>
> When running Preflight in XML mode, the latest Preflight version (I used the JAR from
> build #747) sometimes produces output that contains characters that are illegal in XML. This
> can cause unexpected behavior if such files are further processed with tools that expect well-formed
> XML. See the attached PDFs, which all result in illegal characters in the description of a 1.0
> Syntax error, Error: Expected a long type. Output of older versions of Preflight didn't contain
> these illegal characters; instead they would give something like *actual='/O'*, *actual='Pages'*,
> etc. So I suppose this must have been caused by a fairly recent change.
> See attachments below.
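
For readers who hit the same symptom, here is a minimal sketch of how text could be
sanitized before it is written into an XML report. It is not the fix applied for
PDFBOX-1812 and does not use any Preflight API; the class name XmlCharFilter and its
methods are hypothetical. It assumes XML 1.0, whose Char production allows only #x9, #xA,
#xD, #x20-#xD7FF, #xE000-#xFFFD and #x10000-#x10FFFF, and it assumes that replacing an
illegal character with U+FFFD is acceptable (dropping it would be an equally valid choice).

```java
// Hypothetical helper, not part of PDFBox or Preflight.
final class XmlCharFilter {
    private XmlCharFilter() {}

    /** Returns true if the code point matches the XML 1.0 Char production. */
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Replaces every code point that is illegal in XML 1.0 with U+FFFD. */
    static String sanitize(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            // Iterate by code point so valid surrogate pairs stay intact;
            // an unpaired surrogate is returned as-is and then rejected.
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            out.appendCodePoint(isLegalXmlChar(cp) ? cp : 0xFFFD);
        }
        return out.toString();
    }
}
```

Calling XmlCharFilter.sanitize(...) on a value such as the "actual" field of an error
description would leave ordinary text like /O or Pages unchanged. The usual XML escaping of
<, > and & still has to happen separately.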

This message was sent by Atlassian JIRA
