pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: Scanned PDFs
Date Mon, 13 Jun 2016 18:53:44 GMT
On 13.06.2016 at 20:49, Al Grant wrote:
> Morning
> Is it possible to extract text from a scanned pdf using pdfbox?

If the PDF has an invisible OCRed text layer, yes. If not, then you'd
need to OCR it first. Apache Tika is working on something.
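
As a rough illustration of the distinction above, here is a minimal sketch that decides whether a document probably needs OCR, based on the text an extractor returned. The class and method names (ScannedPdfCheck, needsOcr) are made up for this example and are not part of PDFBox. The PDFBox calls appear only in a comment, and they assume PDFBox 2.x on the classpath. An empty result is only a hint: a scan can still carry a few stray text objects.

```java
// Hypothetical helper, not part of PDFBox: classifies the result of a
// text-extraction attempt on a possibly scanned PDF.
final class ScannedPdfCheck {

    private ScannedPdfCheck() {
        // static helper only, no instances
    }

    /**
     * Returns true when the extracted text is null, empty, or whitespace only.
     * That suggests the PDF has no (invisible) text layer, so it would have
     * to be OCRed before any text can be extracted.
     *
     * With PDFBox 2.x (assumption), the input would typically come from:
     *
     *   try (PDDocument doc = PDDocument.load(new File("scan.pdf"))) {
     *       String text = new PDFTextStripper().getText(doc);
     *   }
     */
    static boolean needsOcr(String extractedText) {
        return extractedText == null || extractedText.trim().isEmpty();
    }
}
```

If needsOcr returns false, the stripper found a text layer and its output can be used directly. If it returns true, the pages are most likely images only.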

