pdfbox-users mailing list archives

From David Goodenough <david.goodeno...@btconnect.com>
Subject Newbie question about parsing PDFs
Date Sun, 25 Sep 2016 10:24:30 GMT

I need to take a PDF document and extract each item of text with its
position on the page.  PDFBox looks to be a good tool to use, but the
examples are mainly to do with building PDFs rather than parsing them
and the API is very rich (for which read large).

Does anyone have any code they would be prepared to share that does
this kind of parsing, or some pointers as to which classes I should
be looking at?

Thank you

David
