pdfbox-users mailing list archives

From "Hamelmann, Stefan - CC" <hamelm...@lenze.de>
Subject Problem with Adobe PDF collections
Date Mon, 17 Jan 2011 15:56:37 GMT


We are using Solr/Lucene to index PDF documents for search. When it encounters an Adobe "PDF collection",
Solr fails to parse the file. The error message is: "File can't be handled as pdf"
Example: http://src.lenze.com/lenze-bibliothek/de/C1_Automation/C11_Industrielle_Kommunikation/C112-b_Ethernet/KHB_Ethernet_DE.pdf

Is this a PDFBox issue? Is there a way to fix it?

Best regards

Stefan Hamelmann
