pdfbox-users mailing list archives

From Maruan Sahyoun <sahy...@fileaffairs.de>
Subject Re: Best way to deal with NULL PDAcroForm fields
Date Wed, 09 Sep 2015 19:12:27 GMT
Maybe you can share some of them (upload them to a public location) so we can get a better idea
and provide some feedback.

BR
Maruan

> On 09.09.2015 at 21:09, Kevin Ternes <KTernes@thegeneral.com> wrote:
> 
> Thank you!
> As I started out saying, I deal with a lot of merkwürdig documents.  I have had a lot of problems with non-compliant PDFs.
> There is no telling how long ago they were created and by whom and with what software.  This change will help.
> 
> -----Original Message-----
> From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de] 
> Sent: Tuesday, September 08, 2015 12:38 PM
> To: users@pdfbox.apache.org
> Subject: Re: Best way to deal with NULL PDAcroForm fields
> 
> Hi,
> 
>> On 08.09.2015 at 18:38, Kevin Ternes <KTernes@thegeneral.com> wrote:
>> 
>> 1.8.10
>> 
>> -----Original Message-----
>> From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de]
>> Sent: Tuesday, September 08, 2015 11:20 AM
>> To: users@pdfbox.apache.org
>> Subject: Re: Best way to deal with NULL PDAcroForm fields
>> 
>> Hi Kevin,
>> 
>>> On 08.09.2015 at 16:45, Kevin Ternes <KTernes@thegeneral.com> wrote:
>>> 
>>> 
>>> I get a lot of weird documents.  When I try to set a particular field value, some of them throw NullPointerExceptions from PDAcroForm.getField(), line 291:
>>> 
>>> 287: COSArray fields =
>>> 288:    (COSArray) acroForm.getDictionaryObject(
>>> 289:        COSName.getPDFName("Fields"));
>>> 290:
>>> 291: for (int i = 0; i < fields.size() && retval == null; i++)
>>> 292: {
>>> 
>>> To avoid this, I was at first calling PDAcroForm.getFields() and checking whether the result was NULL, but I realized that it would usually create a new fields array to return, which seemed wasteful.
> 
> This happens when there is a /Fields entry but it has no content, in which case an empty List is returned, which you could check using List.isEmpty(). Unfortunately, if the /Fields entry is missing completely, null is returned. This has been addressed in PDFBox 2.0.0, where an empty List is always returned in both cases. Please note that /Fields is a required entry, so the PDF(s) are not in line with the spec but should nevertheless be handled correctly.
> 
>>> 
>>> Is the most efficient way to avoid this to first call:
>>>  COSArray fields = (COSArray) acroForm.getDictionaryObject(COSName.getPDFName("Fields"));
>>> myself and check if that is NULL?
> 
> If there is no /Fields entry, getFields() returns null, so you could use that.
> 
>>> 
>>> 
>>> Secondary Question:
>>> The method PDAcroForm.getFields() does a not-NULL check of fields before calling fields.size().
>>> Is there a reason that this check is not performed in getField()?
> 
> That's a bug. I've created https://issues.apache.org/jira/browse/PDFBOX-2965 for that.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> For additional commands, e-mail: users-help@pdfbox.apache.org
> 
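
The guard discussed in the thread (treat a missing /Fields entry as an empty list before iterating) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses plain Java collections as a hypothetical stand-in for the AcroForm dictionary, and the class name, method names, and sample field names are invented for the example. They are not part of the PDFBox API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NullSafeFields {

    // Hypothetical stand-in for the AcroForm dictionary: entry name -> field names.
    // A missing "Fields" key models a PDF that lacks the required /Fields entry.
    public static List<String> fieldsOrEmpty(Map<String, List<String>> acroFormDict) {
        List<String> fields = acroFormDict.get("Fields");
        // Normalize null to an empty list so callers can iterate without a null check.
        return fields == null ? Collections.<String>emptyList() : fields;
    }

    // Looks up a field by name; returns null when it is absent instead of throwing.
    public static String findField(Map<String, List<String>> acroFormDict, String name) {
        for (String candidate : fieldsOrEmpty(acroFormDict)) {
            if (candidate.equals(name)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Dictionary without a "Fields" entry (the non-compliant case).
        Map<String, List<String>> noFieldsEntry = new HashMap<>();

        // Dictionary with two illustrative field names.
        Map<String, List<String>> withFields = new HashMap<>();
        withFields.put("Fields", Arrays.asList("policyNumber", "insuredName"));

        System.out.println(findField(noFieldsEntry, "policyNumber"));
        System.out.println(findField(withFields, "policyNumber"));
    }
}
```

With PDFBox 1.8.x itself, the equivalent check would be made on the value returned by acroForm.getDictionaryObject(COSName.getPDFName("Fields")) or by getFields(), as described in the replies above.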

