pdfbox-dev mailing list archives

From "Johanneke Lamberink (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PDFBOX-3646) Annotations parsed from XFDF containing ampersand characters are not properly imported
Date Thu, 15 Nov 2018 18:19:00 GMT

    [ https://issues.apache.org/jira/browse/PDFBOX-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16688466#comment-16688466 ]

Johanneke Lamberink commented on PDFBOX-3646:

[~tilman] I switched jobs last year and no longer work with PDFBox. Kudos on fixing the issue
though :)

> Annotations parsed from XFDF containing ampersand characters are not properly imported
> --------------------------------------------------------------------------------------
>                 Key: PDFBOX-3646
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3646
>             Project: PDFBox
>          Issue Type: Bug
>          Components: AcroForm, PDModel
>    Affects Versions: 2.0.3, 2.0.4, 2.0.5, 2.0.6
>         Environment: java 1.8.0_112
>            Reporter: Kai Keggenhoff
>            Assignee: Tilman Hausherr
>            Priority: Major
>              Labels: xfdf
>             Fix For: 2.0.13, 3.0.0 PDFBox
>         Attachments: MergeTest.java, output1.pdf, output2.pdf, sample.xfdf
> Annotations containing "&" in their text are displayed incorrectly when parsed unmodified
> from XFDF (where the ampersands are encoded as "&amp;") and added to a PDF document.
> This occurs for both "text comment" and "text box" type annotations.
> However, if the XFDF is modified by replacing "&amp;" with "&amp;amp;" prior
> to parsing, the imported annotations are displayed correctly.
> The attached code produces two PDF files: one with the unmodified XFDF imported,
> the other with the modified XFDF.
> An XFDF containing both a text box and a text comment annotation is embedded in the source
> and attached as a separate file.
> Update 23.03.2017: This problem persists in 2.0.5, and we noticed that the same corruption
> of merged annotations occurs if the annotation text contains a "<" (encoded as the "&lt;" entity).
> Update 17.10.2018: This corruption is caused by FDFAnnotation.richContentsToString.
> This method reads "<" and "&" from the parsed values in the document and puts them
> as such into the markup, but these characters must be replaced with their entities.
> I'll add this substitution to my proposed bugfix for PDFBOX-4345; please refer to https://issues.apache.org/jira/projects/PDFBOX/issues/PDFBOX-4345
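
The substitution described in the 17.10.2018 update can be sketched roughly as follows. This is only a minimal illustration of replacing "&" and "<" (and, for symmetry, ">") with their entities before text is written back into markup. It is not the actual PDFBox fix, and the class name XmlTextEscaper and the method escape are hypothetical names chosen for this example.

```java
// Hypothetical helper, NOT part of PDFBox: escapes characters that would
// otherwise be interpreted as markup when text is re-serialized as XML/XHTML.
final class XmlTextEscaper {

    private XmlTextEscaper() {
        // utility class, no instances
    }

    // Replaces '&', '<' and '>' with their predefined XML entities.
    // Each character is visited exactly once, so the '&' that starts an
    // emitted entity is never escaped a second time.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    sb.append("&amp;");
                    break;
                case '<':
                    sb.append("&lt;");
                    break;
                case '>':
                    sb.append("&gt;");
                    break;
                default:
                    sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Text as it looks after the XML parser has already decoded the entities.
        System.out.println(escape("R&D < 5")); // prints R&amp;D &lt; 5
    }
}
```

Processing the text in a single pass avoids the ordering pitfall of chained replace calls, where escaping "<" first and "&" second would turn "&lt;" into "&amp;lt;".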

