pdfbox-users mailing list archives

From Andreas Lehmkuehler <andr...@lehmi.de>
Subject Re: Signed PDF with non-encrypted headers causes issue in PDFBox 2.0.9
Date Thu, 26 Apr 2018 05:20:44 GMT
On 25.04.2018 at 14:20, Evert-Jan de Bruin wrote:
> Hi,
> 
> Okay, so it seems the document header is incorrect according to the official standards.
Correct

> But this document is not some minor exception, because this is an official document from
> the Dutch government proving your educational history. We have tens of thousands of these
> documents to process in our system. We can't ask the government to change their PDFs and have
> 10,000+ students re-request their certificate because PDFBox can't handle the (incorrect)
> header :-)
> 
You have to blame the software producing those invalid PDFs. The fact that it
has produced thousands of broken PDFs doesn't make things any better.

> Readers like Adobe (or others) don't complain either.
That doesn't make the PDFs conform to the spec either. All of those readers
somehow repair the broken PDF.

> Isn't there some way to make PDFBox work with these kinds of documents?
PDFBox already has a lot of repair mechanisms in place. For now I have no idea
how to handle your problematic PDFs.

Andreas

> 
> Thanks,
> Evert-Jan de Bruin
> 
> -----Original Message-----
> From: Tilman Hausherr <THausherr@t-online.de>
> Sent: Tuesday, 24 April 2018 21:36
> To: users@pdfbox.apache.org
> Subject: Re: Signed PDF with non-encrypted headers causes issue in PDFBox 2.0.9
> 
> I had a quick look... yes the document info is unencrypted which is incorrect. EncryptMeta
> is false but this applies only to XMP metadata streams.
> 
> Tilman
> 
> On 24.04.2018 at 12:48, Evert-Jan de Bruin wrote:
>>
>> Hello,
>>
>> For my project I have to merge PDF files together. This usually works
>> fine, but it does not always work with digitally signed PDF files.
>>
>> Simply calling load() on the document already fails with
>> InvalidBlockSizeException. Here is an example document:
>> https://ufile.io/mgshz
>>
>> I went into the PDFBox code, and the issue seems to be that it detects
>> AES encryption in the PDF due to the digital signature, but then
>> assumes everything is encrypted and needs to be decrypted. However,
>> the headers are **not** encrypted so decryption fails.
>>
>> I can get it all to work by going to PDFParser.java and disabling
>> these three lines in prepareDecryption():
>>
>> // securityHandler = encryption.getSecurityHandler();
>> // securityHandler.prepareForDecryption(encryption, document.getDocumentID(),
>> //         decryptionMaterial);
>> // accessPermission = securityHandler.getCurrentAccessPermission();
>>
>> However, this is of course very ugly as decryption is now totally
>> disabled. I also get warnings about offset issues but the end result
>> seems fine.
>>
>> Is there a more elegant solution or is this really a bug?
>>
>> It seems to be a repeat of PDFBOX-3229
>> (https://issues.apache.org/jira/browse/PDFBOX-3229), which should
>> have been fixed in 2.0.0; however, it still occurs in 2.0.9.
>>
>> Regards,
>>
>> Evert-Jan de Bruin
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> For additional commands, e-mail: users-help@pdfbox.apache.org
> 
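
The failure mode described in the original report (running AES decryption over data that was never encrypted) can be reproduced with the JDK alone, without PDFBox. The sketch below is only an illustration under stated assumptions: the class name, the all-zero key and the sample Info-dictionary date string are hypothetical, and the exception shown is the JDK's javax.crypto.IllegalBlockSizeException, which is presumably what the report calls InvalidBlockSizeException. It follows the PDF convention that an AES-encrypted string starts with a 16-byte IV followed by CBC ciphertext with PKCS#5 padding; it does not reproduce PDFBox's own code path.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;

public class PlaintextDecryptDemo {

    /**
     * Treats the first 16 bytes of data as the IV and the rest as
     * AES-CBC ciphertext, as PDF does for AES-encrypted strings.
     * Returns "ok" or the simple name of the exception that was thrown.
     */
    static String tryDecrypt(byte[] data) {
        if (data.length < 16) {
            return "too short for an IV";
        }
        try {
            byte[] key = new byte[16]; // hypothetical all-zero demo key
            byte[] iv = Arrays.copyOfRange(data, 0, 16);
            byte[] body = Arrays.copyOfRange(data, 16, data.length);

            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.DECRYPT_MODE,
                    new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(iv));
            cipher.doFinal(body);
            return "ok";
        } catch (GeneralSecurityException e) {
            // IllegalBlockSizeException and BadPaddingException both land here
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Hypothetical unencrypted Info entry, e.g. a creation date (23 bytes)
        byte[] plain = "D:20180424123456+02'00'"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println(plain.length + " bytes -> " + tryDecrypt(plain));
    }
}
```

With a 23-byte plaintext string, 7 bytes remain after the assumed IV, which is not a multiple of the 16-byte AES block size, so the cipher rejects the input before any padding check. An unencrypted string whose remainder happens to be a multiple of 16 bytes would more likely fail on padding instead.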

