pdfbox-users mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject RE: Make PDFBox fail on bad pdf
Date Thu, 30 Mar 2017 12:16:31 GMT
If you have any recommendations for the more general case, let us know on TIKA-1443 [1].

[1] https://issues.apache.org/jira/browse/TIKA-1443

-----Original Message-----
From: Wouter De Borger [mailto:wouter.deborger@inmanta.com] 
Sent: Thursday, March 30, 2017 6:00 AM
To: users@pdfbox.apache.org
Subject: Make PDFBox fail on bad pdf

Hi All,

When a PDF has bad encoding, PDFBox produces garbage (as explained in the FAQ https://pdfbox.apache.org/2.0/faq.html#gibberish).

Can I make PDFBox fail in this case instead of producing garbage?

(I'm working on a system that can also do OCR, so at the first sign of trouble, I would like
PDFBox to fail so that I can try OCR instead.)
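
One possible way to approximate this from the caller's side is to inspect the extracted text and decide whether to fall back to OCR. The sketch below is only a heuristic, not a PDFBox feature: the class name GibberishCheck, its methods, and the threshold are all hypothetical, and it assumes the text was already obtained, for example with PDFTextStripper.getText(document). It counts replacement, private-use, unassigned, and control characters; it will not catch text that decodes to valid but wrong characters.

```java
/**
 * Hypothetical helper (not part of PDFBox): flags extracted text that
 * contains a high share of characters that rarely appear in real text.
 */
final class GibberishCheck {
    private GibberishCheck() {}

    /**
     * Returns the fraction of non-whitespace code points that are U+FFFD,
     * private-use, unassigned, control, or unpaired surrogate characters.
     */
    static double suspiciousRatio(String text) {
        int total = 0;
        int suspicious = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isWhitespace(cp)) {
                continue; // whitespace says nothing about encoding quality
            }
            total++;
            int type = Character.getType(cp);
            if (cp == 0xFFFD
                    || type == Character.PRIVATE_USE
                    || type == Character.UNASSIGNED
                    || type == Character.CONTROL
                    || type == Character.SURROGATE) {
                suspicious++;
            }
        }
        return total == 0 ? 0.0 : (double) suspicious / total;
    }

    /** True when the suspicious share exceeds the caller-chosen threshold. */
    static boolean looksLikeGibberish(String text, double threshold) {
        return suspiciousRatio(text) > threshold;
    }
}
```

A caller could run the check on each page's text and send the page to OCR when looksLikeGibberish(text, 0.1) returns true; the 0.1 threshold is an arbitrary starting point that would need tuning on real documents.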
