pdfbox-users mailing list archives

From Tolen Miller <tolenmil...@gmail.com>
Subject Re: Cannot load pre existing PDF to access fields
Date Wed, 26 Aug 2015 12:03:40 GMT
Ah. Thanks so much. I didn't realize I had to traverse the structure. My
apologies.

@Roberto Thank you for the sample code!

On Wed, Aug 26, 2015, 02:16 Roberto Nibali <rnibali@gmail.com> wrote:

> Hi
>
> On Wed, Aug 26, 2015 at 9:27 AM, Maruan Sahyoun <sahyoun@fileaffairs.de>
> wrote:
>
> > Hi,
> >
> > > On 26.08.2015 at 06:00, Tolen Miller <tolenmiller@msn.com> wrote:
> > >
> > > I uploaded my PDF again, if someone wants to see if they can get all of
> > > the fields to return: http://1drv.ms/1PRKZsI
> > >
> > > After looking at the sample provided by Maruan, I noticed that I was not
> > > passing in a File object when calling the PDDocument.load() method.
> > > Doing so, I now get the same result from Maruan's code (in Eclipse).
> > >
> > > Now I am unsure how to get *all* of the fields from the PDAcroForm. I am
> > > trying to get a collection of the fields, so I can loop through them.
> > > When I add this code:
> > >
> > > List<PDField> pdfFields = form.getFields();
> > > for (PDField field : pdfFields) {
> > >     System.out.println("PDF Field Full Name: ".concat(field.getFullyQualifiedName()));
> > > }
> > >
> >
> > As there is only one 'root' field, you have to get its kids and process
> > the field tree down. Take a look at
> > org.apache.pdfbox.examples.fdf.PrintFields for how to do that.
> >
> >
> Having spent the last two months intensively with form fields, here is my
> current code to dump the fields:
>
> private void executeDumpFields(String srcDocName) throws IOException {
>     PDDocument srcDoc = null;
>     try {
>         srcDoc = PDDocument.load(new File(srcDocName));
>         srcDoc.getDocumentCatalog().getAcroForm().getFields().forEach(this::dumpField);
>     } catch (Exception e) {
>         logerr(e.getMessage());
>     } finally {
>         // Close once, here, so the document is released even if the dump fails.
>         if (srcDoc != null) {
>             srcDoc.close();
>         }
>     }
> }
>
> private void dumpField(PDField srcField) {
>     if (srcField instanceof PDNonTerminalField) {
>         // A non-terminal field only groups other fields: recurse into its children.
>         ((PDNonTerminalField) srcField).getChildren().forEach(this::dumpField);
>     } else if (!(srcField instanceof PDSignatureField)) {
>         System.out.printf("fqName=%s type=%s%n",
>                 srcField.getFullyQualifiedName(),
>                 srcField.getClass().getSimpleName());
>     }
> }
>
> Maybe you can use some of it. Just call executeDumpFields(...) with the
> appropriate PDF name as a string and go from there. Not understanding the
> PDF standard and how the dictionary trees are built up inside a PDF, I had a
> hard time initially understanding why I needed to go recursively
> through the PDField entries.
>
> Cheers
> Roberto
>
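
The recursion discussed in this thread can also be sketched without PDFBox on the classpath. The example below is only an illustration under stated assumptions: FieldTreeSketch, Node, and terminalNames are hypothetical names that model a field tree with plain Java objects, they are not PDFBox classes, and the dot-joined names merely imitate the style of a fully qualified field name. The point is that a form with a single root field still yields every leaf once the children are walked.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class FieldTreeSketch {

    // Hypothetical stand-in for a form field: a partial name plus optional kids.
    public static final class Node {
        final String partialName;
        final List<Node> kids = new ArrayList<>();

        public Node(String partialName) {
            this.partialName = partialName;
        }

        public Node add(Node kid) {
            kids.add(kid);
            return this;
        }
    }

    // Collects the dot-joined names of all leaf nodes, depth first.
    public static List<String> terminalNames(List<Node> roots) {
        List<String> out = new ArrayList<>();
        for (Node root : roots) {
            collect(root, "", out);
        }
        return out;
    }

    private static void collect(Node node, String prefix, List<String> out) {
        String name = prefix.isEmpty() ? node.partialName : prefix + "." + node.partialName;
        if (node.kids.isEmpty()) {
            out.add(name);                 // leaf: report it
        } else {
            for (Node kid : node.kids) {   // grouping node: descend into its kids
                collect(kid, name, out);
            }
        }
    }

    public static void main(String[] args) {
        // One root field, as in the PDF from this thread; the names are made up.
        Node root = new Node("form1")
                .add(new Node("page1").add(new Node("name")).add(new Node("email")))
                .add(new Node("signatureDate"));
        terminalNames(Collections.singletonList(root)).forEach(System.out::println);
    }
}
```

Looping over the root list alone would visit only form1 here, which mirrors the single 'root' field seen with getFields(); the recursive walk reaches the three leaves beneath it.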
