pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: Read and or Flatten xfa form
Date Sat, 23 Feb 2019 10:35:18 GMT
PDFBox doesn't handle XFA, so you're on your own there, or you should buy
a product that can (I think iText can do it).

XFA is an XML format. So once you have called getDocument(), you need to
inspect the XML it returns yourself. The XFA specification is 1500 pages long.

If all the documents you want to handle have the same content, then you
might be able to get what you need without reading the specification.
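
To illustrate the "look at the XML" approach, here is a minimal sketch that uses only the JDK's DOM API. It assumes the org.w3c.dom.Document is namespace-aware and that the filled-in values sit as leaf elements under the xfa:data element of the datasets packet; the namespace URI in the code is an assumption about your files, so check it against the XML you actually get. The class name XfaLeafDump and its methods are hypothetical, and parse() only stands in for the Document that getDocument() would give you.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of PDFBox.
final class XfaLeafDump {

    // Assumed namespace of the xfa:datasets / xfa:data elements.
    static final String XFA_DATA_NS = "http://www.xfa.org/schema/xfa-data/1.0/";

    // Stand-in for PDXFA.getDocument(): parse an XML string into a DOM tree.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns dotted element path -> trimmed text for every leaf under xfa:data.
    // Repeated paths (e.g. table rows) overwrite each other in this sketch.
    static Map<String, String> leafValues(Document doc) {
        Map<String, String> out = new LinkedHashMap<>();
        NodeList dataNodes = doc.getElementsByTagNameNS(XFA_DATA_NS, "data");
        for (int i = 0; i < dataNodes.getLength(); i++) {
            collect((Element) dataNodes.item(i), "", out);
        }
        return out;
    }

    // Depth-first walk: recurse into child elements, record elements without any.
    private static void collect(Element el, String path, Map<String, String> out) {
        boolean hasChildElement = false;
        NodeList kids = el.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node kid = kids.item(i);
            if (kid.getNodeType() == Node.ELEMENT_NODE) {
                hasChildElement = true;
                String name = kid.getLocalName();
                collect((Element) kid, path.isEmpty() ? name : path + "." + name, out);
            }
        }
        if (!hasChildElement && !path.isEmpty()) {
            out.put(path, el.getTextContent().trim());
        }
    }
}
```

Calling XfaLeafDump.leafValues(doc) on the Document gives you a map from paths to values that you can print or look up by name. Field types and layout live in the template packet, which this sketch does not read.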

Tilman

Am 23.02.2019 um 02:55 schrieb Nick Westerly:
> Hi, my ultimate goal is to extract text data from PDF forms that use XFA. Is
> it possible to use PDFBox to flatten PDFs with XFA forms (to simplify text
> extraction)?
>
> If not can the fields themselves be easily parsed?
>
> I see
> https://stackoverflow.com/questions/14454387/pdfbox-how-to-flatten-a-pdf-form
> which seems to say that XFA is not flattenable?
>
> I see this class,
> https://pdfbox.apache.org/docs/1.8.12/javadocs/org/apache/pdfbox/pdmodel/interactive/form/PDXFA.html,
> once I call getDocument, how can I get fields (by name/type) and contents?
>
> Thanks!
>

