pdfbox-users mailing list archives

From Kevin Ternes <KTer...@thegeneral.com>
Subject RE: Best way to deal with NULL PDAcroForm fields
Date Wed, 09 Sep 2015 19:09:20 GMT
Thank you!
As I started out saying, I deal with a lot of merkwürdig documents.  I have had a lot of
problems with non-compliant PDFs.
There is no telling how long ago they were created and by whom and with what software.  This
change will help.

-----Original Message-----
From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de] 
Sent: Tuesday, September 08, 2015 12:38 PM
To: users@pdfbox.apache.org
Subject: Re: Best way to deal with NULL PDAcroForm fields

Hi,

> Am 08.09.2015 um 18:38 schrieb Kevin Ternes <KTernes@thegeneral.com>:
> 
> 1.8.10
> 
> -----Original Message-----
> From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de]
> Sent: Tuesday, September 08, 2015 11:20 AM
> To: users@pdfbox.apache.org
> Subject: Re: Best way to deal with NULL PDAcroForm fields
> 
> Hi Kevin,
> 
>> Am 08.09.2015 um 16:45 schrieb Kevin Ternes <KTernes@thegeneral.com>:
>> 
>> 
>> I get a lot of weird documents.  When I try to set a particular field value, some of them throw NullPointerExceptions from PDAcroForm.getField(), line 291:
>> 
>> 287: COSArray fields =
>> 288:    (COSArray) acroForm.getDictionaryObject(
>> 289:        COSName.getPDFName("Fields"));
>> 290:
>> 291: for (int i = 0; i < fields.size() && retval == null; i++)
>> 292: {
>> 
>> To avoid this, I was at first calling PDAcroForm.getFields() and checking whether the result was NULL, but I realized that it would usually create a new fields array to return, which seemed wasteful.

This happens when there is a /Fields entry but it has no content, in which case an empty
List is returned, which you could check using List.isEmpty(). Unfortunately, if the /Fields
entry is missing completely, null is returned. This has been addressed in PDFBox 2.0.0, where
an empty List is always returned in both cases. Please note that /Fields is a required entry,
so the PDF(s) are not in line with the spec but should nevertheless be handled correctly.

>> 
>> Is the most efficient way to avoid this to first call:
>>   COSArray fields = (COSArray) acroForm.getDictionaryObject(COSName.getPDFName("Fields"));
>> myself and check if that is NULL?

If there is no /Fields entry, getFields() returns null, so you could use that.
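
A minimal sketch of that caller-side check, written against plain java.util types so it runs without PDFBox on the classpath. The class FieldsGuard and the helper orEmpty are hypothetical names, not part of PDFBox; the List<String> merely stands in for whatever getFields() returns in 1.8.x.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class FieldsGuard {
    // Hypothetical helper (not a PDFBox API): treats a null list,
    // i.e. a missing /Fields entry, the same as an empty one.
    public static <T> List<T> orEmpty(List<T> maybeNull) {
        return maybeNull == null ? Collections.<T>emptyList() : maybeNull;
    }

    public static void main(String[] args) {
        // Stand-in for the result of getFields() on a PDF without /Fields.
        List<String> missing = null;
        // Stand-in for the result on a PDF with a populated /Fields array.
        List<String> present = Arrays.asList("firstName", "lastName");

        // Both cases can now be handled with a single isEmpty() test.
        System.out.println(orEmpty(missing).isEmpty());
        System.out.println(orEmpty(present).isEmpty());
    }
}
```

With a wrapper like this, the caller only ever tests isEmpty(), which resembles the behaviour described above for 2.0.0.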

>> 
>> 
>> Secondary Question:
>> The method PDAcroForm.getFields() does a not-NULL check of fields before calling fields.size().
>> Is there a reason that this check is not performed in getField()?

That's a bug. I've created https://issues.apache.org/jira/browse/PDFBOX-2965 for that.
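
For illustration only, a guarded version of a lookup loop of the same shape as the one quoted above might look like the following. This is not the actual PDFBOX-2965 patch; GuardedLookup and find are hypothetical names, and a List<String> of field names stands in for the COSArray so the sketch runs with the JDK alone.

```java
import java.util.Arrays;
import java.util.List;

class GuardedLookup {
    // Hypothetical stand-in for a getField()-style lookup: returns the
    // matching name, or null if the list is null or holds no match.
    public static String find(List<String> fieldNames, String target) {
        String retval = null;
        // The null check that the quoted loop lacks: skip the loop
        // entirely when there is no list to iterate over.
        if (fieldNames != null) {
            for (int i = 0; i < fieldNames.size() && retval == null; i++) {
                if (fieldNames.get(i).equals(target)) {
                    retval = fieldNames.get(i);
                }
            }
        }
        return retval;
    }

    public static void main(String[] args) {
        System.out.println(find(null, "firstName"));
        System.out.println(find(Arrays.asList("firstName", "lastName"), "lastName"));
    }
}
```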

