pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject fwd: A Benchmark and Evaluation for Text Extraction from PDF
Date Sat, 15 Jul 2017 11:22:02 GMT
http://ad-publications.informatik.uni-freiburg.de/benchmark.pdf

A Benchmark and Evaluation for Text Extraction from PDF

PDFBox is the best in 4 categories, the worst in one (missing newlines), 
and near the top in one (lack of errors). I have asked the authors to 
name some of the files affected by missing newlines, and the two files 
that produced errors.

Tilman


