pdfbox-users mailing list archives

From Vomlel Jan <Jan.Vom...@aipsafe.cz>
Subject RE: problem with pdf eof
Date Thu, 16 Oct 2014 13:44:21 GMT
When I use load instead of loadNonSeq, the signatures are valid in this case.

But for some documents the load function does not read the complete document. That is why I used loadNonSeq.
Some signatures are then missing.

h1.pdf - original file (signature and timestamp)
h2.pdf - first signature added with PDFBox (timestamp is missing)
h3.pdf - second signature added with PDFBox (timestamp and previous signature are missing)

-----Original Message-----
From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de] 
Sent: Thursday, October 16, 2014 2:37 PM
To: users@pdfbox.apache.org
Subject: Re: problem with pdf eof

When signing, please make sure that you load the PDF using PDDocument.load instead of PDDocument.loadNonSeq.

On 16.10.2014 at 11:57, Vomlel Jan <Jan.Vomlel@aipsafe.cz> wrote:

> -----Original Message-----
> From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de] 
> Sent: Thursday, October 16, 2014 11:55 AM
> To: users@pdfbox.apache.org
> Subject: Re: problem with pdf eof
> When you say invalid, do you mean it’s corrupted, or e.g. that you get a warning sign in Adobe Reader? Would you have a sample PDF?
> When you sign a document and then sign it again, the first signature points to a different document revision, because you have changed the document’s content afterwards. So invalid in that context could mean that the warning you might be getting only reflects that fact. I would need to see the document to understand what’s going on.
> BR
> Maruan
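The point about document revisions can be illustrated without any PDF library. In the PDF specification, a signature dictionary carries a /ByteRange array of (offset, length) pairs naming the bytes the signature digest covers. The following is a minimal sketch, assuming the common layout of exactly two ranges that start at offset 0 and leave a gap for the signature contents; the class and method names are hypothetical and are not part of PDFBox.

```java
/**
 * Sketch: decide whether a signature's /ByteRange covers the whole file
 * or only an earlier revision (a prefix of the file).
 * Assumes the usual layout [0 len1 start2 len2]; names are illustrative.
 */
final class ByteRangeCheck {

    enum Coverage { WHOLE_FILE, EARLIER_REVISION, MALFORMED }

    static Coverage classify(long[] byteRange, long fileLength) {
        if (byteRange == null || byteRange.length != 4) {
            return Coverage.MALFORMED;
        }
        long start1 = byteRange[0];
        long len1 = byteRange[1];
        long start2 = byteRange[2];
        long len2 = byteRange[3];

        // The first range is expected to begin at the start of the file,
        // and the second range must not overlap the first one.
        if (start1 != 0 || len1 <= 0 || len2 < 0 || start2 < len1) {
            return Coverage.MALFORMED;
        }
        long end = start2 + len2;
        if (end > fileLength) {
            return Coverage.MALFORMED; // range points past the end of the file
        }
        // A signature that ends before the end of the file was followed by
        // later incremental updates, for example a second signature.
        return end == fileLength ? Coverage.WHOLE_FILE : Coverage.EARLIER_REVISION;
    }
}
```

After a second signature is appended with an incremental save, the first signature would be expected to classify as EARLIER_REVISION and the second as WHOLE_FILE. A viewer warning on the first signature may therefore only reflect that later changes exist, not that the signed bytes were altered.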
> On 16.10.2014 at 11:48, Vomlel Jan <Jan.Vomlel@aipsafe.cz> wrote:
>> Hi Maruan and others,
>> I created a signature and it seems OK.
>> But when I create a second signature (loadNonSeq, addSignature, saveIncremental again), the first signature becomes invalid.
>> I think the problem may be that the first page is updated (the signature is invisible), but I don't understand it well enough.
>> Jan
>> -----Original Message-----
>> From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de] 
>> Sent: Monday, October 13, 2014 4:09 PM
>> To: users@pdfbox.apache.org
>> Subject: Re: problem with pdf eof
>> Hi Jan,
>> there are samples in the examples package for various ways to sign a document [1]. Signing a document needs incremental saving.
>> OTOH, the choice of the right solution should not be based on whether there is a license fee or not.
>> Maruan Sahyoun
>> [1] http://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/signature/
>> On 13.10.2014 at 16:02, Vomlel Jan <Jan.Vomlel@aipsafe.cz> wrote:
>>> Hi Maruan (and others),
>>> I would like to use PDFBox and Bouncy Castle for managing PDF signatures: parsing, validation, timestamping (PAdES LTV).
>>> We used iText for it, but it is under a commercial licence.
>>> Parsing signatures seems to be working (thanks to your advice), so I will try to create a timestamp.
>>> Is it possible with PDFBox? I found the save method on PDDocument, but I'm afraid that it can change the byte representation of the PDF, so that the signatures become invalid. Is that true? What is the right way to create a signature or timestamp with PDFBox?
>>> Jan
>>> -----Original Message-----
>>> From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de] 
>>> Sent: Friday, October 10, 2014 10:44 AM
>>> To: users@pdfbox.apache.org
>>> Subject: Re: problem with pdf eof
>>> Hi Jan,
>>> Choosing the right technology is very important, so I do understand your concerns. I had to make such a decision about using PDFBox in the past too.
>>> It can 
>>> If you have specific issues I can answer, I’m happy to try to do so. As a general statement, PDFBox is used in production environments today (as an example, we ourselves are using it for a banking customer to process account statements, for an airline company to preprocess archiving documents, and for various other customers).
>>> PDFBox is continuously enhancing its parsing as we try to deal with real-world PDF files, which are not always in line with the PDF specification. Currently the best approach is to use PDDocument.loadNonSeq (which parses documents according to the xref information) and, in case of an exception, PDDocument.load (which parses sequentially). The Apache Tika project, which uses PDFBox for parsing PDFs, is running the parsing and text extraction against 50k PDFs made available via http://digitalcorpora.org
>>> What is the application you would like to use PDFBox for? Text extraction, image conversion, …? I might be able to give you more specific information for your use
>>> BR
>>> Maruan
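The "try the xref-based parser first, then fall back to the sequential one on an exception" approach described above can be sketched as a small generic helper. This is only an illustration: the `Loader` interface and `loadWithFallback` method are hypothetical names, and the PDFBox calls appear only in a comment, on the assumption that PDFBox 1.8.x is on the classpath. Note that a fallback of this kind only triggers on an exception; it would not catch a case where a loader returns normally with an incomplete document model.

```java
import java.io.IOException;

/** Sketch of a "primary loader, then fallback loader" strategy. Names are illustrative. */
final class FallbackLoad {

    /** A loading step that may fail with an IOException. */
    @FunctionalInterface
    interface Loader<T> {
        T load() throws IOException;
    }

    /**
     * Runs the primary loader; if it throws an IOException, runs the fallback.
     * If both fail, the fallback's exception is thrown with the primary
     * failure attached as a suppressed exception.
     *
     * With PDFBox 1.8.x the two loaders might be, as an assumption:
     *   () -> PDDocument.loadNonSeq(file, null)   // xref-based
     *   () -> PDDocument.load(file)               // sequential
     */
    static <T> T loadWithFallback(Loader<T> primary, Loader<T> fallback) throws IOException {
        try {
            return primary.load();
        } catch (IOException primaryFailure) {
            try {
                return fallback.load();
            } catch (IOException fallbackFailure) {
                fallbackFailure.addSuppressed(primaryFailure);
                throw fallbackFailure;
            }
        }
    }
}
```

Keeping the primary failure as a suppressed exception preserves the reason the xref-based attempt failed, which helps when reporting a problem file to the list.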
>>>> On 10.10.2014 at 10:10, Vomlel Jan <Jan.Vomlel@aipsafe.cz> wrote:
>>>> Thank you, Maruan, this function loads the document.
>>>> I have read https://pdfbox.apache.org/ideas.html "Replace/Enhance PDF parsing". I think correct parsing is very important, and I have some doubts whether I can use PDFBox in production. Can you say something to reassure me :-)?
>>>> Jan
>>>> -----Original Message-----
>>>> From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de] 
>>>> Sent: Friday, October 10, 2014 9:25 AM
>>>> To: users@pdfbox.apache.or
>>>> Subject: Re: problem with pdf eof
>>>> Hi 
>>>> you can try PDDocument.loadNonSeq(InputStream is, null) 
>>>> BR
>>>> Maruan
>>>> On 10.10.2014 at 09:09, Vomlel Jan <Jan.Vomlel@aipsafe.cz> wrote:
>>>>> Hello,
>>>>> I use the PDFBox 1.8.7 PDDocument.load(InputStream is) method to parse the PDF document in the attachment.
>>>>> The method returns without an exception, but the document model is incomplete.
>>>>> The problem is in the characters after EOF (offset 22939):
>>>>> startxref
>>>>> 22449
>>>>> %%EOF
>>>>> @
>>>>> 16 0 obj
>>>>> << 
>>>>> /Type /Catalog
>>>>> PDFBox creates an internal IOException and ignores it, with this comment:
>>>>>                 /*
>>>>>                  * PDF files may have random data after the EOF marker. Ignore errors if
>>>>>                  * last object processed is EOF.
>>>>>                  */
>>>>> Is this PDF construction valid?
>>>>> Which parser in PDFBox is correct? I tried ConformingPDFParser, but another error occurred.
>>>>> Jan
>>>>> Neither this e-mail nor any of the attached files constitutes acceptance of an offer to conclude a contract, unless this is expressly stated in them. If that is not the case, they cannot be regarded as an act that would establish any claims against AiP Safe. This e-mail is intended solely for the stated recipient and for other persons named as recipients, and its content, including the content of all attached files, is confidential. If you are not an authorized recipient, please refrain from any form of disclosure, reproduction, copying, distribution or dissemination of its content, including the content of all attached files. If you have received this e-mail in error, please notify the sender immediately and delete the e-mail, including all attached files. All e-mails addressed to, received by or sent by AiP Safe s.r.o. or employees of AiP Safe s.r.o. are regarded as work e-mails as a matter of principle. Accordingly, the sender or recipient of these e-mails agrees that they may be read by employees of AiP Safe s.r.o. other than the given recipient or sender, in order to ensure the continuity of work activities and to enable their monitoring.
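The file tail quoted in the first message of the thread (startxref, an offset, %%EOF, then further bytes) can be inspected with plain Java before handing a file to any parser. The following is a minimal sketch that finds the last %%EOF marker, reads the startxref offset in front of it, and reports whether non-whitespace data follows the marker. The class and method names are hypothetical, and the sketch does not validate the xref table itself.

```java
import java.nio.charset.StandardCharsets;

/** Sketch: look at the tail of a PDF held in memory. Names are illustrative. */
final class PdfTailInspector {

    private static final byte[] EOF_MARKER = "%%EOF".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] STARTXREF = "startxref".getBytes(StandardCharsets.US_ASCII);

    /** PDF whitespace characters: NUL, TAB, LF, FF, CR, SPACE. */
    private static boolean isWhitespace(byte b) {
        return b == 0 || b == 9 || b == 10 || b == 12 || b == 13 || b == 32;
    }

    /** Index of the last occurrence of needle that starts at or before from, or -1. */
    static int lastIndexOf(byte[] data, byte[] needle, int from) {
        for (int i = Math.min(from, data.length - needle.length); i >= 0; i--) {
            int j = 0;
            while (j < needle.length && data[i + j] == needle[j]) {
                j++;
            }
            if (j == needle.length) {
                return i;
            }
        }
        return -1;
    }

    /** True if any non-whitespace byte follows the last %%EOF; false if there is no marker. */
    static boolean hasDataAfterLastEof(byte[] pdf) {
        int eof = lastIndexOf(pdf, EOF_MARKER, pdf.length);
        if (eof < 0) {
            return false;
        }
        for (int i = eof + EOF_MARKER.length; i < pdf.length; i++) {
            if (!isWhitespace(pdf[i])) {
                return true;
            }
        }
        return false;
    }

    /** The number following the last startxref keyword before the last %%EOF, or -1. */
    static long lastStartXref(byte[] pdf) {
        int eof = lastIndexOf(pdf, EOF_MARKER, pdf.length);
        if (eof < 0) {
            return -1;
        }
        int keyword = lastIndexOf(pdf, STARTXREF, eof);
        if (keyword < 0) {
            return -1;
        }
        int i = keyword + STARTXREF.length;
        while (i < eof && isWhitespace(pdf[i])) {
            i++;
        }
        long value = 0;
        boolean sawDigit = false;
        while (i < eof && pdf[i] >= '0' && pdf[i] <= '9') {
            value = value * 10 + (pdf[i] - '0');
            sawDigit = true;
            i++;
        }
        return sawDigit ? value : -1;
    }
}
```

For the tail shown in the first message, `lastStartXref` would return 22449 and `hasDataAfterLastEof` would return true, which flags the file as one where a sequential parser and an xref-based parser may see different content.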
