pdfbox-dev mailing list archives

From Tim Allison <talli...@apache.org>
Subject Comparing extracted text with pdftotext
Date Mon, 26 Nov 2018 20:49:55 GMT

  I just finished drafting a high-level "lab report" comparing
pdftotext and Tika/PDFBox on the PDFs in our refreshed regression
corpus: https://wiki.apache.org/tika/ComparisonTikaAndPDFToText201811.
The more interesting bits are in the actual reports from tika-eval
and/or the comparison database available here:

  Let me know what you think.



