pdfbox-users mailing list archives

From "Cary L. Schofield" <cary.schofi...@eesoh.com>
Subject Re: PDFParser Conflict Resolution
Date Mon, 24 Feb 2014 22:14:49 GMT
Thanks for your reply.  I have followed your recommendation.  There was a
TODO in the NonSequentialParser indicating that signature contents
are not encrypted and thus should not be decrypted.  I have added code to
skip decryption in this case, and my documents now seem to be parsed correctly.

Thanks again.
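
For readers following the thread, the change described above can be sketched roughly as follows. This is not the actual PDFBox code or the patch that was applied; it is a minimal model in plain Java. The class SignatureAwareDecryption, the method decryptStrings, the use of a Map as a stand-in for a PDF dictionary, and the XOR "cipher" are all hypothetical and chosen only for illustration. The sketch also assumes a signature dictionary can be recognised by /Type /Sig, which real files do not always guarantee.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical model, not PDFBox API: a Map stands in for a PDF dictionary,
// and byte[] values stand in for PDF strings.
class SignatureAwareDecryption {

    // Decrypts every string value in the dictionary (recursively), but leaves
    // the Contents entry of a signature dictionary untouched.
    static void decryptStrings(Map<String, Object> dict, UnaryOperator<byte[]> decrypt) {
        // Assumption: a signature dictionary is identified by Type == "Sig".
        boolean isSignature = "Sig".equals(dict.get("Type"));
        for (Map.Entry<String, Object> entry : dict.entrySet()) {
            Object value = entry.getValue();
            if (value instanceof byte[]) {
                if (isSignature && "Contents".equals(entry.getKey())) {
                    continue; // signature contents are stored unencrypted
                }
                entry.setValue(decrypt.apply((byte[]) value));
            } else if (value instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> nested = (Map<String, Object>) value;
                decryptStrings(nested, decrypt);
            }
        }
    }

    // Toy stand-in for the real RC4/AES decryption; XOR is its own inverse.
    static byte[] xor(byte[] data) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ 0x5A);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> sig = new LinkedHashMap<>();
        sig.put("Type", "Sig");
        sig.put("Contents", new byte[] {0x30, (byte) 0x82});
        sig.put("Reason", xor("approved".getBytes(StandardCharsets.US_ASCII)));
        decryptStrings(sig, SignatureAwareDecryption::xor);
        System.out.println(new String((byte[]) sig.get("Reason"), StandardCharsets.US_ASCII));
    }
}
```

The point of the guard is that running the decryption routine over bytes that were never encrypted scrambles them, which would match the "signed data appears to have been corrupted" symptom reported earlier in the thread.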


On 02/22/2014 09:23 AM, Maruan Sahyoun wrote:
> Hi,
>
> the PDFParser works sequentially through the file from top to bottom and collects
> all objects. Conflict resolution is done by assuming that if an object with the
> same number exists later in the file, the later one is the correct one.
>
> NonSequentialParser works through the file by looking at the Xref information (table
> or stream). This is in line with the PDF specification.
>
> So patching as you’ve done might resolve your issue but might also introduce issues
> with other files. The best way would be to find out why NonSequentialParser has issues parsing
> your file. If you think it’s a bug, please open an issue in Jira [https://issues.apache.org/jira/browse/PDFBOX]
> and attach the PDF file together with some sample code.
>
> BR
> Maruan Sahyoun
>
> On 21.02.2014 at 23:47, Cary L. Schofield <cary.schofield@eesoh.com> wrote:
>
>> I have a signed document that is getting parsed incorrectly.
>>
>> Using PDFParser the document form is missing all fields and I can't get to the signature
>> fields.
>> Using NonSequentialPDFParser I can get to the signature fields but the signed data
>> appears to have been corrupted.
>>
>> I was able to determine that the form was being replaced or corrupted during conflict
>> resolution.
>>
>> I solved the problem by patching PDFParser.ConflictObj to ignore an object in the
>> conflict list when the existing object (from the object pool) is a direct object.
>>
>> I know I should do the research, but was hoping someone would already know if the
>> patch is reasonable or likely to cause more/other problems.
>>
>> Thanks
>>
>
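
The two resolution strategies described in the quoted reply can be contrasted with a small model. This is only a sketch under stated assumptions, not how either PDFBox parser is implemented: the class ObjectResolutionSketch, the RawObject type, and both method names are hypothetical, and a "file" is reduced to a list of (object number, byte offset, body) entries. It shows how a "later object wins" rule and an xref lookup can disagree when the same object number appears twice; it does not claim this is what happened in the file discussed above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not PDFBox API.
class ObjectResolutionSketch {

    // One object definition as found in the file.
    static final class RawObject {
        final int number;
        final long offset;
        final String body;

        RawObject(int number, long offset, String body) {
            this.number = number;
            this.offset = offset;
            this.body = body;
        }
    }

    // Sequential strategy: scan top to bottom; a later object with the same
    // number overwrites the earlier one.
    static Map<Integer, String> resolveSequentially(List<RawObject> objectsInFileOrder) {
        Map<Integer, String> pool = new LinkedHashMap<>();
        for (RawObject o : objectsInFileOrder) {
            pool.put(o.number, o.body);
        }
        return pool;
    }

    // Xref-driven strategy: only load the object found at the offset the
    // cross-reference information records for each object number.
    static Map<Integer, String> resolveByXref(List<RawObject> objects, Map<Integer, Long> xref) {
        Map<Long, RawObject> byOffset = new HashMap<>();
        for (RawObject o : objects) {
            byOffset.put(o.offset, o);
        }
        Map<Integer, String> pool = new LinkedHashMap<>();
        for (Map.Entry<Integer, Long> e : xref.entrySet()) {
            RawObject o = byOffset.get(e.getValue());
            if (o != null && o.number == e.getKey()) {
                pool.put(o.number, o.body);
            }
        }
        return pool;
    }

    public static void main(String[] args) {
        List<RawObject> objects = Arrays.asList(
                new RawObject(5, 100L, "form with fields"),
                new RawObject(5, 900L, "empty form"));
        Map<Integer, Long> xref = new HashMap<>();
        xref.put(5, 100L); // xref points at the first occurrence
        System.out.println(resolveSequentially(objects).get(5));
        System.out.println(resolveByXref(objects, xref).get(5));
    }
}
```

In this toy case the sequential scan keeps the later copy of object 5 while the xref lookup keeps the one the cross-reference entry points to, which is why a heuristic patch to the sequential rule can help one file and hurt another.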

