pdfbox-users mailing list archives

From Qingchao Kong <kqingc...@gmail.com>
Subject PDF text extraction result different from what they look in PDF reader application
Date Mon, 05 May 2014 10:50:16 GMT
Hi, I am using PDFBox to extract text from PDF files.
I noticed that, for some PDF files (usually old ones), when you select
some text with the mouse in a PDF reader application (I use Evince
on Ubuntu), different text shows up in place of the text that is
displayed when nothing is selected.

I find that PDFBox sometimes extracts the text that appears on
selection, not the text that is displayed when nothing is selected.
Could anybody tell me why this happens? Is my question clear?
