pdfbox-users mailing list archives

From Malcolm Vincent <malcolmvinc...@gmail.com>
Subject Adobe InDesign PDF
Date Thu, 09 Nov 2017 08:53:06 GMT

I've been using PDFBox to read and write PDFs successfully for a while
and have started running into a few issues recently.

I am getting the following warnings when loading PDFs generated in
Adobe InDesign / Acrobat Distiller (the PDFs render fine in Acrobat
Reader, pdf.js and Chrome).

The first one seems to come from the PDFReader UI rather than from
parsing, so I'm ignoring it.

The second and third are the problem, and they are related. I get
them when I use PDFBox in my own code as well as in the app, but since
they are logged as warnings rather than thrown, they are not runtime
errors I can catch.

Nov 09, 2017 8:31:45 AM java.util.prefs.WindowsPreferences <init>
WARNING: Could not open/create prefs root node Software\JavaSoft\Prefs
at root 0x80000002. Windows RegCreateKeyEx(...) returned error code 5.

Nov 09, 2017 8:32:03 AM org.apache.pdfbox.pdfparser.BaseParser
WARNING: Bad Dictionary Declaration

Nov 09, 2017 8:32:03 AM org.apache.pdfbox.pdfparser.BaseParser
WARNING: Invalid dictionary, found: '?' but expected: '/' at offset 2861
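
One way to make these warnings visible to calling code is to attach a
log handler to the parser's logger. The following is a minimal sketch,
assuming the warnings reach java.util.logging (the timestamp format of
the output above suggests this, but it depends on how logging is
configured). The logger name is taken from the output above;
ParserWarningCollector and its methods are hypothetical names and are
not part of PDFBox.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical helper: collects WARNING (or higher) records from the parser's logger. */
class ParserWarningCollector extends Handler {
    // Logger name as it appears in the log output; assumed to be a java.util.logging logger.
    static final String PARSER_LOGGER = "org.apache.pdfbox.pdfparser.BaseParser";

    // Keep a strong reference so the logger (and this handler) is not garbage collected.
    private final Logger logger = Logger.getLogger(PARSER_LOGGER);
    private final List<String> warnings = new ArrayList<>();

    /** Creates a collector and registers it on the parser's logger. */
    static ParserWarningCollector attach() {
        ParserWarningCollector collector = new ParserWarningCollector();
        collector.logger.addHandler(collector);
        return collector;
    }

    /** Unregisters the collector; call this once loading has finished. */
    void detach() {
        logger.removeHandler(this);
    }

    @Override
    public synchronized void publish(LogRecord record) {
        // Ignore anything below WARNING.
        if (record.getLevel().intValue() >= Level.WARNING.intValue()) {
            warnings.add(record.getMessage());
        }
    }

    @Override
    public void flush() {
        // Nothing is buffered.
    }

    @Override
    public void close() {
        // No resources to release.
    }

    /** Returns a copy of the warning messages seen so far. */
    synchronized List<String> getWarnings() {
        return new ArrayList<>(warnings);
    }
}
```

The intended use is to call ParserWarningCollector.attach() before
loading the document, check getWarnings() afterwards, and then call
detach(). The caller can then decide whether to reject the file or
attempt a repair. The logger is shared, so documents loaded
concurrently on other threads would report into the same collector.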

I have traced the problem to the following PDF content at the end of
Page 1 Stream 1.

/Span <</Lang (en-GB)/MCID 8 >>BDC
9 0 0 9 99.3376 555.6879 Tm
(text string here)Tj
/Span <</Lang

The last dictionary entry seems to be incomplete.

When I go on to process the files in my own code, I iterate over the
content stream, perform my function and replace the stream content.
The stream ends up incorrect, and the resulting PDFs will not load in
Acrobat Reader (although they do in Chrome).

My options appear to be:

(a) grep the file for this and remove or overwrite it with a string
operation before using PDFBox

(b) update the source to cope with this condition

(c) kick the PDF back as invalid - difficult since the file is a
"valid" PDF that is generated in Adobe and reads ok in Adobe

I have verified this by manually overtyping <</Lang with spaces and
then everything works perfectly in my own code and in PDFReader.
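
The manual fix described above could be automated along the lines of
option (a). The following is a rough sketch, not a full content stream
parser: it works on the decoded (uncompressed) content stream bytes,
looks for a "<<" that is never closed by ">>", and overwrites it and
everything after it with spaces so the byte length is unchanged.
TrailingDictBlanker and its methods are hypothetical names. The scan
skips literal and hex strings but does not handle comments or inline
image data, so it is an assumption that the streams in question
contain neither.

```java
import java.util.Arrays;

/** Hypothetical helper: blanks out an unterminated "<<" dictionary in a content stream. */
class TrailingDictBlanker {

    /** Returns the offset of the outermost unclosed "<<", or -1 if every dictionary is closed. */
    static int findUnclosedDict(byte[] s) {
        int depth = 0;
        int outermost = -1;
        int i = 0;
        while (i < s.length) {
            byte b = s[i];
            if (b == '(') {
                // Literal string: its contents must not be mistaken for delimiters.
                i = skipLiteralString(s, i);
            } else if (b == '<' && i + 1 < s.length && s[i + 1] == '<') {
                if (depth == 0) {
                    outermost = i;
                }
                depth++;
                i += 2;
            } else if (b == '<') {
                // Hex string: skip to the closing '>' and step past it.
                i++;
                while (i < s.length && s[i] != '>') {
                    i++;
                }
                i++;
            } else if (b == '>' && i + 1 < s.length && s[i + 1] == '>') {
                if (depth > 0) {
                    depth--;
                }
                i += 2;
            } else {
                i++;
            }
        }
        return depth > 0 ? outermost : -1;
    }

    /** Returns the offset just past the literal string that starts at 'start'. */
    static int skipLiteralString(byte[] s, int start) {
        int depth = 0;
        int i = start;
        while (i < s.length) {
            byte b = s[i];
            if (b == '\\') {
                i += 2; // skip the escaped character
                continue;
            }
            if (b == '(') {
                depth++;
            } else if (b == ')') {
                depth--;
                if (depth == 0) {
                    return i + 1;
                }
            }
            i++;
        }
        return s.length; // unterminated string: treat the rest as string data
    }

    /** Returns a copy with the unclosed dictionary and everything after it replaced by spaces. */
    static byte[] blankUnclosedDict(byte[] s) {
        int pos = findUnclosedDict(s);
        if (pos < 0) {
            return s;
        }
        byte[] out = s.clone();
        Arrays.fill(out, pos, out.length, (byte) ' ');
        return out;
    }
}
```

Applied to the fragment quoted earlier, this leaves the complete
/Span <</Lang (en-GB)/MCID 8 >>BDC entry alone and replaces only the
trailing <</Lang with spaces, which matches the manual edit. Like the
manual edit, it leaves the dangling /Span name in place.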

Any thoughts?

Best wishes,
