pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: Accessing check-boxes in a non-acro form PDF
Date Thu, 10 May 2018 15:30:18 GMT
On 10.05.2018 at 16:26, Ankit Inkollu wrote:
>   Hi All,
>
> *Scenario:*
> I need to verify if the check-box for a certain field in a non-acro form
> PDF is ticked or not.
>
> *Options tried:*
> 1. I tried to search for any class in PDFBOX which points to the check-box
> but could not find any.

There isn't one if the PDF is neither an AcroForm nor XFA. A box is just
a box, i.e. a shape drawn somewhere on the page (unless the character for
a checked box is used).
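
For the "character for a checked box" case, a minimal sketch of scanning
already-extracted text for check-box glyphs might look like the code
below. It assumes the text was obtained with PDFBox 2.x
(PDDocument.load plus PDFTextStripper.getText) and that the font maps its
box glyphs to the Unicode code points listed; many PDFs use symbol fonts
that do not, so this is an assumption to verify per document. The class
name CheckboxGlyphs and the method findMarks are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class CheckboxGlyphs {
    // Assumed glyph sets; extend them if a document uses other characters.
    static final String UNCHECKED = "\u2610";        // ballot box
    static final String CHECKED = "\u2611\u2612";    // ballot box with check / with X

    /**
     * Returns one entry per check-box glyph found in the text, formatted as
     * "offset:checked" or "offset:unchecked", in order of appearance.
     */
    static List<String> findMarks(String text) {
        List<String> marks = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (CHECKED.indexOf(c) >= 0) {
                marks.add(i + ":checked");
            } else if (UNCHECKED.indexOf(c) >= 0) {
                marks.add(i + ":unchecked");
            }
        }
        return marks;
    }

    public static void main(String[] args) {
        // In real use the string would come from PDFTextStripper.getText(document).
        System.out.println(findMarks("Yes \u2611 No \u2610"));
    }
}
```

The offsets only say where a glyph sits in the extracted text; tying a
glyph to a particular field label is left to the caller.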

> 2. Tried using the co-ordinates of the check-box and create an image and
> then compare it against an already stored image of a check-box but this is
> quite cumbersome and fails for few PDFs.
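
A possible variation on option 2, offered only as a sketch: instead of
comparing against a stored reference image, measure how much "ink" lies
inside the box. This is a different technique from the one described
above, not a PDFBox feature. It assumes the page has been rendered to a
BufferedImage (for example with PDFRenderer.renderImageWithDPI in PDFBox
2.x) and that the rectangle is given in pixel coordinates, inset far
enough that the box border itself is not counted. The class name
CheckboxInk, both method names and the threshold values are hypothetical
and would need tuning per document.

```java
import java.awt.image.BufferedImage;

class CheckboxInk {
    /** Fraction of pixels in the rectangle whose luminance (0..255) is below the threshold. */
    static double darkFraction(BufferedImage img, int x, int y, int w, int h, int threshold) {
        int dark = 0;
        for (int py = y; py < y + h; py++) {
            for (int px = x; px < x + w; px++) {
                int rgb = img.getRGB(px, py);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of perceived brightness.
                int lum = (r * 299 + g * 587 + b * 114) / 1000;
                if (lum < threshold) {
                    dark++;
                }
            }
        }
        return (double) dark / (w * h);
    }

    /** Treats the box as ticked when enough of its interior is dark. */
    static boolean looksTicked(BufferedImage img, int x, int y, int w, int h,
                               int threshold, double minFraction) {
        return darkFraction(img, x, y, w, h, threshold) > minFraction;
    }
}
```

For example, looksTicked(pageImage, 100, 200, 20, 20, 128, 0.03) would
report a box as ticked when more than 3% of the 20x20 pixel interior is
darker than mid-gray.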



>
>
> Is there a way in PDFBox which can implement the above mentioned scenario.
> If this does not work out, is there an OCR API in JAVA which will help.

Tesseract has a Java interface, but PDFBox itself has no OCR. Tika has an
OCR option, and it uses Tesseract.

Tilman


>
> Do let me know if any of you have faced such a situation.
>
> Thanks
> Ankit
>



