pdfbox-users mailing list archives

From Evan Smith <evan.smith.ms...@gmail.com>
Subject unable to extract text
Date Thu, 11 Feb 2016 18:37:19 GMT

Using PDFBox:
java -jar pdfbox-app-1.8.11.jar ExtractText lenvima-epar-product-information.pdf lenvima-epar-product-information.txt

The extracted text comes out about 15 characters wide, as one long 
narrow column.  On later pages the extractor seems to recover the normal 
line width, and then it gets confused again.

I am able to use PDFBox on other PDFs and it works great, so something 
about this particular PDF is the issue.

Note that when I copy and paste out of Adobe Reader, I get the same 
column issue.

Any ideas on how to extract the text from this file with a larger line width?

See the attached PDF.
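
One possible workaround, sketched here as an assumption rather than a confirmed fix: if the narrow lines are at least in the right reading order, the extracted .txt file could be post-processed to join each paragraph's short lines and re-wrap them at a wider width. The class name Rewrap and the method rewrap are hypothetical names for illustration; this uses only the Java standard library, does not call PDFBox at all, and does not address why this PDF extracts narrowly in the first place.

```java
import java.util.ArrayList;
import java.util.List;

public class Rewrap {
    /**
     * Joins the short lines of each blank-line-separated paragraph
     * and greedily re-wraps the words at the given width.
     * Assumes the words are already in the correct reading order.
     */
    public static String rewrap(String text, int width) {
        List<String> paragraphs = new ArrayList<>();
        // Two line breaks with only whitespace between them mark a paragraph break.
        for (String block : text.split("\\R\\s*\\R")) {
            StringBuilder out = new StringBuilder();
            int lineLen = 0;
            for (String word : block.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                // Start a new line if this word would exceed the width.
                if (lineLen > 0 && lineLen + 1 + word.length() > width) {
                    out.append('\n');
                    lineLen = 0;
                }
                if (lineLen > 0) {
                    out.append(' ');
                    lineLen++;
                }
                out.append(word);
                lineLen += word.length();
            }
            if (out.length() > 0) {
                paragraphs.add(out.toString());
            }
        }
        return String.join("\n\n", paragraphs);
    }

    public static void main(String[] args) {
        // Made-up sample input imitating a narrow extracted column.
        String narrow = "The quick\nbrown fox\njumps over\nthe lazy dog.\n\nSecond\nparagraph.";
        System.out.println(rewrap(narrow, 72));
    }
}
```

This only helps when the column is a pure line-wrapping problem; if the extracted words themselves are out of order, re-wrapping will not repair them.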


Evan C. Smith, MS, MD
Cell: 781-879-8736
