pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: extracting checkboxes in non acroform pdf
Date Thu, 29 Nov 2018 07:04:51 GMT
It could be an XFA forms pdf... then you'd have to analyze the XML content.
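For the XFA case, a minimal sketch of what "analyze the XML content" could look like, assuming the XFA XML has already been pulled out of the PDF as a string (how to get it is not shown here). It relies on the XFA template convention that a check box is a `field` whose `ui` child holds a `checkButton`. The class name `XfaCheckButtons` and the method `find` are made up for illustration. It only reads the value stored in the template; values filled in by a user may live in a separate data packet that this sketch ignores.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of PDFBox.
public class XfaCheckButtons {

    // Returns the first direct child element with the given local name, or null.
    private static Element directChild(Element parent, String localName) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && localName.equals(n.getLocalName())) {
                return (Element) n;
            }
        }
        return null;
    }

    // Maps field name -> raw text of the field's own <value> element,
    // for every <field> whose <ui> contains a <checkButton>.
    public static Map<String, String> find(String xfaXml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getLocalName()
            doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(xfaXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XFA XML", e);
        }

        Map<String, String> result = new LinkedHashMap<>();
        NodeList fields = doc.getElementsByTagNameNS("*", "field");
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            Element ui = directChild(field, "ui");
            if (ui == null || directChild(ui, "checkButton") == null) {
                continue; // not a check box
            }
            // Use the direct <value> child only, so a caption's nested value is skipped.
            Element value = directChild(field, "value");
            result.put(field.getAttribute("name"),
                    value == null ? "" : value.getTextContent().trim());
        }
        return result;
    }
}
```

What the returned text means (for example `1` for checked) depends on how the form defines its on and off values, so treat the raw string as something to interpret per document.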

It could be widget annotations without an AcroForm, then you'd have to
analyze these.

It could be ordinary text, then the text stripper would do the job.
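For the ordinary-text case, a rough sketch of post-processing the string that the text stripper returns. It assumes the check boxes come out as the Unicode ballot-box characters U+2610 (empty), U+2611 and U+2612 (ticked or crossed); a given PDF may well use a symbol font that maps to different characters, so the glyph set is an assumption to adjust after looking at the real output. `CheckboxGlyphs`, `Mark` and `scan` are illustrative names, not PDFBox API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that works on already-extracted text.
public class CheckboxGlyphs {

    /** One checkbox-like glyph found in the text, with the text that follows it. */
    public static final class Mark {
        public final boolean checked;
        public final String label;

        Mark(boolean checked, String label) {
            this.checked = checked;
            this.label = label;
        }
    }

    // Assumed glyph set: BALLOT BOX, BALLOT BOX WITH CHECK, BALLOT BOX WITH X.
    private static final String UNCHECKED = "\u2610";
    private static final String CHECKED = "\u2611\u2612";

    private static boolean isBox(char c) {
        return UNCHECKED.indexOf(c) >= 0 || CHECKED.indexOf(c) >= 0;
    }

    /** Scans line by line; the label is the text up to the next box or the end of the line. */
    public static List<Mark> scan(String text) {
        List<Mark> marks = new ArrayList<>();
        for (String line : text.split("\\R")) {
            int i = 0;
            while (i < line.length()) {
                char c = line.charAt(i);
                if (!isBox(c)) {
                    i++;
                    continue;
                }
                int end = i + 1;
                while (end < line.length() && !isBox(line.charAt(end))) {
                    end++;
                }
                marks.add(new Mark(CHECKED.indexOf(c) >= 0,
                        line.substring(i + 1, end).trim()));
                i = end; // continue at the next box (or end of line)
            }
        }
        return marks;
    }
}
```

Taking the text after the box as its label is a guess about layout; forms that put the label before the box would need the opposite direction.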

It could be vector graphics, then it gets really difficult.


On 28.11.2018 at 23:05, Nicolas Paris wrote:
> Hi
> I have several PDFs created with PDFCreator and I want to extract
> the content as text, including the values of the checkboxes in them.
> The PDF looks like a regular form PDF with checkboxes. However, it is not
> an AcroForm-based PDF, and the regular PDFBox code I use in this case
> does not apply: the AcroForm is null!
> I wonder how I can iterate over those checkboxes (or visually equivalent
> objects or symbols).
> If someone can give me a starting point for listing all objects in that
> PDF, that might be helpful to begin with.
> Thanks in advance,
