pdfbox-users mailing list archives

From Stuart Coleman <stu...@eduvee.com>
Subject Is PDFBox capable of detecting features Acrobat Reader can highlight
Date Wed, 12 Jun 2013 19:35:19 GMT

I have a PDF file that I am trying to extract text from. Unfortunately, the document is
non-sequential and contains various boxes with supplementary content. When I open the file in
Acrobat Reader, it seems able to distinguish these features and can surround them with a
blue bounding box. I would like to extract text by area from within these bounding boxes.
Is PDFBox also capable of detecting these features?

I have attached a screenshot showing the style of box I am referring to (top right-hand corner).

