pdfbox-users mailing list archives

From David Hill <DH...@StudentLoan.org>
Subject RE: {External} Re: upgrading to 2.x - package private class PDFieldFactory?
Date Mon, 01 Aug 2016 13:54:45 GMT

The bug does still exist, hence my continued need to override the class in question.

"Subclassing huge swathes of internal code" gets the job done without waiting for a future
release. Making it harder for users to work around issues might increase patch submission
but doesn't sound like the best overall solution.

The bug is that when merging two PDFs, if the first PDF doesn't have any tags and the second
one does, the tags are lost. Tags are a critical feature for Section 508 compliance.
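
To make the symptom concrete, here is a toy model in plain Java. It is not PDFBox code: the
class TagMergeSketch and everything in it (Element, Tree, Doc, lossyMerge, preservingMerge,
taggedDoc) are hypothetical names invented for illustration, and the lossy branch is only an
assumption about how this kind of loss can arise, not a description of what PDFMergerUtility
actually does.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are PDFBox classes.
class TagMergeSketch {

    /** Stand-in for a tagged structure element. */
    static class Element {
        final String type;
        Tree parent;
        Element(String type) { this.type = type; }
    }

    /** Stand-in for a structure tree root. */
    static class Tree {
        final List<Element> kids = new ArrayList<>();
        // Re-parents the element to this tree and records it as a kid.
        void appendKid(Element e) { e.parent = this; kids.add(e); }
    }

    /** Stand-in for a document; tree is null when the document is untagged. */
    static class Doc {
        Tree tree;
        int tagCount() { return tree == null ? 0 : tree.kids.size(); }
    }

    /** Hypothetical merge that copies tags only when the destination already has a tree. */
    static void lossyMerge(Doc dest, Doc src) {
        if (dest.tree == null || src.tree == null) return; // source tags silently dropped
        for (Element e : new ArrayList<>(src.tree.kids)) dest.tree.appendKid(e);
    }

    /** Hypothetical merge that creates the destination tree on demand. */
    static void preservingMerge(Doc dest, Doc src) {
        if (src.tree == null) return;                      // nothing to carry over
        if (dest.tree == null) dest.tree = new Tree();     // untagged destination gets a tree
        for (Element e : new ArrayList<>(src.tree.kids)) dest.tree.appendKid(e);
    }

    /** Builds a tagged document with one element per given type. */
    static Doc taggedDoc(String... types) {
        Doc d = new Doc();
        d.tree = new Tree();
        for (String t : types) d.tree.appendKid(new Element(t));
        return d;
    }

    public static void main(String[] args) {
        Doc lossy = new Doc();                             // first PDF: no tags
        lossyMerge(lossy, taggedDoc("P", "H1"));           // second PDF: tagged
        System.out.println("lossy merge tags: " + lossy.tagCount());          // 0

        Doc kept = new Doc();
        preservingMerge(kept, taggedDoc("P", "H1"));
        System.out.println("preserving merge tags: " + kept.tagCount());      // 2
    }
}
```

The only difference between the two merges is whether an untagged destination is given a
tree before the source kids are appended and re-parented.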

I don't have a patch, and I haven't fixed the latest version yet. In version 1.8.6, your
code in PDFMergerUtility.appendDocument at line 327, which looks like this

            PDAcroForm destAcroForm = destCatalog.getAcroForm();
            PDAcroForm srcAcroForm = srcCatalog.getAcroForm();
            if (destAcroForm == null)

we changed to this

            PDAcroForm destAcroForm = destCatalog.getAcroForm();
            PDAcroForm srcAcroForm = srcCatalog.getAcroForm();
            if (srcAcroForm != null && destAcroForm == null)
                destAcroForm = new PDAcroForm(destination);
                PDResources res = new PDResources();

                mergeAcroForm(cloner, destAcroForm, srcAcroForm);

and line 554 which was

            kDictLevel0.setItem(COSName.K, newKArray);
            kDictLevel0.setItem(COSName.P, destStructTree);
            kDictLevel0.setItem(COSName.S, new COSString(STRUCTURETYPE_DOCUMENT));

we apparently changed to

            //Kids of source document (insert)
            List<Object> addKids = srcStructTree.getKids();
            for (Object k : addKids) {
                ((PDStructureElement) k).setParent(destStructTree);
                destStructTree.appendKid((PDStructureElement) k);

I did not make the original changes and I have not yet attempted to correct version 2.0.2
but without these changes we have failing tests looking for tags and have verified the output
files are missing tags.

Thank you.


From: John Hewson [john@jahewson.com]
Sent: Saturday, July 30, 2016 2:47 AM
To: users@pdfbox.apache.org
Subject: {External} Re: upgrading to 2.x - package private class PDFieldFactory?

> On 29 Jul 2016, at 06:09, David Hill <DHill@StudentLoan.org> wrote:
> I am upgrading a project from 1.8.6 to the 2.0.2 version of PDFBox and everything went
> reasonably well except that we made use of PDFieldFactory
> PDField destField = PDFieldFactory.createField
> Everything in PDFieldFactory used to be public but now it is all package private.
> I tried to copy the code for that method into our project and found that all of the various
> PDField objects being created were also package private.
> It looks like the code we have came from overriding and copying PDFMergerUtility.appendDocument
> to fix an apparent bug. PDFMergerUtility.appendDocument makes a call to PDFieldFactory.createField.
> I could continue to force my way around this issue but I have to wonder if this was intentional.

Yep. Subclassing huge swathes of internal code isn't a great solution to fixing a bug in an
open source project. We'd happily take a patch for the bug if it still exists.

-- John

> Dave


