From: Maruan Sahyoun
To: users@pdfbox.apache.org
Subject: Re: PDFParser Conflict Resolution
Date: Sat, 22 Feb 2014 17:23:09 +0100

Hi,

the PDFParser works sequentially through the file from top to bottom and collects all objects. Conflicts are resolved by assuming that if an object with the same number exists later in the file, the later one is the correct one.

NonSequentialPDFParser works through the file by following the xref information (table or stream). This is in line with the PDF specification.

So patching as you've done might resolve your issue but might also introduce issues with other files. The best way would be to find out why NonSequentialPDFParser has issues parsing your file. If you think it's a bug, please open an issue in Jira [https://issues.apache.org/jira/browse/PDFBOX] and attach the PDF file together with some sample code.

BR
Maruan Sahyoun

On 21.02.2014 at 23:47, Cary L. Schofield wrote:

> I have a signed document that is getting parsed incorrectly.
>
> Using PDFParser the document form is missing all fields and I can't get to the signature fields.
> Using NonSequentialPDFParser I can get to the signature fields but the signed data appears to have been corrupted.
>
> I was able to determine that the form was being replaced or corrupted during conflict resolution.
>
> I solved the problem by patching PDFParser.ConflictObj to ignore an object in the conflict list when the existing object (from the object pool) is a direct object.
>
> I know I should do the research, but was hoping someone would already know if the patch is reasonable or likely to cause more/other problems.
>
> Thanks
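
To illustrate the difference between the two strategies described in the reply, here is a minimal sketch in plain Java. It does not use the PDFBox API and is not PDFBox's actual implementation: the class ConflictResolutionSketch, the type RawObject, the method names and the sample data are all hypothetical, and the file contents are a contrived case chosen so that the two strategies disagree.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ConflictResolutionSketch {

    /** Hypothetical stand-in for an indirect object found at some byte offset. */
    static final class RawObject {
        final int number;
        final long offset;
        final String body;

        RawObject(int number, long offset, String body) {
            this.number = number;
            this.offset = offset;
            this.body = body;
        }
    }

    /**
     * Sequential strategy: walk the objects in file order. When an object
     * number is seen again, the later object replaces the earlier one.
     */
    static Map<Integer, String> resolveSequentially(List<RawObject> objectsInFileOrder) {
        Map<Integer, String> pool = new HashMap<>();
        for (RawObject obj : objectsInFileOrder) {
            pool.put(obj.number, obj.body); // later occurrence overwrites
        }
        return pool;
    }

    /**
     * Xref strategy: load only the object found at the offset that the
     * xref information records for each object number.
     */
    static Map<Integer, String> resolveByXref(List<RawObject> objects, Map<Integer, Long> xref) {
        Map<Long, RawObject> byOffset = new HashMap<>();
        for (RawObject obj : objects) {
            byOffset.put(obj.offset, obj);
        }
        Map<Integer, String> pool = new HashMap<>();
        for (Map.Entry<Integer, Long> entry : xref.entrySet()) {
            RawObject obj = byOffset.get(entry.getValue());
            // Skip entries that point at nothing or at a different object number.
            if (obj != null && obj.number == entry.getKey()) {
                pool.put(entry.getKey(), obj.body);
            }
        }
        return pool;
    }

    public static void main(String[] args) {
        // Contrived file: object 5 appears twice, but the xref names the first copy.
        List<RawObject> file = new ArrayList<>();
        file.add(new RawObject(5, 100L, "form with fields"));
        file.add(new RawObject(7, 300L, "signature"));
        file.add(new RawObject(5, 900L, "stray copy"));

        Map<Integer, Long> xref = new HashMap<>();
        xref.put(5, 100L);
        xref.put(7, 300L);

        System.out.println("sequential: " + resolveSequentially(file).get(5));
        System.out.println("xref:       " + resolveByXref(file, xref).get(5));
    }
}
```

In this made-up input the sequential pass keeps the later copy of object 5, while the xref lookup keeps the copy the xref points to. In a well-formed file with incremental updates both strategies would normally agree, because the newest xref section also points to the newest copy. A patch that changes which duplicate wins therefore only matters for files where they disagree, which is why it can help one file and break another.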