pdfbox-users mailing list archives

From: David Goodenough <david.goodeno...@btconnect.com>
Subject: Re: Newbie question about parsing PDFs
Date: Sun, 25 Sep 2016 16:16:00 GMT
On Sunday, 25 September 2016 12:31:04 BST Tilman Hausherr wrote:
> On 25.09.2016 at 12:24, David Goodenough wrote:
> > I need to take a PDF document and extract each item of text with its
> > position on the page.  PDFBox looks to be a good tool to use, but the
> > examples are mainly to do with building PDFs rather than parsing them
> > and the API is very rich (for which read large).
> > 
> > Does anyone have any code they would be prepared to share that does
> > this kind of parsing, or some pointers as to which classes I should
> > be looking at?
> 
> Have a look at PrintTextLocations.java in the source download.
> 
> Tilman
Wonderful, looks like exactly what I was looking for.

David
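
For readers who land on this thread, here is a minimal sketch of the approach that PrintTextLocations.java illustrates: subclass PDFTextStripper and override writeString, which receives the TextPosition objects for each run of text. It assumes PDFBox 2.0.x (current at the time of this thread) on the classpath. The class name TextWithPositions and the Glyph holder are hypothetical names made up for this illustration; the example shipped in the source download remains the reference.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.text.TextPosition;

// Hypothetical helper (not part of PDFBox): collects each character
// together with the page it is on and its position.
public class TextWithPositions extends PDFTextStripper {

    // Simple value holder for one extracted character.
    public static final class Glyph {
        public final int page;     // 1-based page number
        public final String text;  // Unicode text of the glyph
        public final float x;      // direction-adjusted x coordinate
        public final float y;      // direction-adjusted y coordinate

        Glyph(int page, String text, float x, float y) {
            this.page = page;
            this.text = text;
            this.x = x;
            this.y = y;
        }

        @Override
        public String toString() {
            return "page " + page + " [" + x + ", " + y + "] " + text;
        }
    }

    private final List<Glyph> glyphs = new ArrayList<>();

    public TextWithPositions() throws IOException {
        // Ask the stripper to order text by position on the page
        // rather than by order in the content stream.
        setSortByPosition(true);
    }

    // Called by PDFTextStripper for each run of text; instead of writing
    // the text to the output, record every character with its position.
    @Override
    protected void writeString(String text, List<TextPosition> positions)
            throws IOException {
        for (TextPosition p : positions) {
            glyphs.add(new Glyph(getCurrentPageNo(), p.getUnicode(),
                    p.getXDirAdj(), p.getYDirAdj()));
        }
    }

    // Runs the stripper over the whole document and returns what it collected.
    public static List<Glyph> extract(PDDocument doc) throws IOException {
        TextWithPositions stripper = new TextWithPositions();
        stripper.getText(doc); // returned string is ignored; we want the side effect
        return stripper.glyphs;
    }

    public static void main(String[] args) throws IOException {
        // PDDocument.load(File) is the PDFBox 2.0.x way to open a file.
        try (PDDocument doc = PDDocument.load(new File(args[0]))) {
            for (Glyph g : extract(doc)) {
                System.out.println(g);
            }
        }
    }
}
```

Run it with the path to a PDF as the only argument. This sketch records one entry per character; to get whole words or lines with a position, the entries would need to be grouped afterwards, for example by page and y coordinate.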
