lucene-java-user mailing list archives

From Andrzej Bialecki ...@getopt.org>
Subject Re: Lucene demo ideas?
Date Wed, 17 Sep 2003 15:10:17 GMT
Erik Hatcher wrote:
> On Wednesday, September 17, 2003, at 08:42  AM, Ben Litchfield wrote:
> 
>> What, no PDF files!!
> 
> 
> Haha!
> 
>> http://www.pdfbox.org
> 
> 
> And I've used PDFBox before - it's cool.
> 
> And I'm cool with adding PDF and Word indexing to the demo personally, 
> but I didn't want to increase the "weight" of the demo application.  If 
> folks feel strongly about it then I'll incorporate it.

A word of warning: PDFBox is fantastic, I agree, but some PDFs are not.
In my application PDFBox hung on a number of PDFs: it would start
parsing and then get stuck in an infinite wait somewhere (I can send
the files to Ben if required). My workaround is to run the parser in a
separate thread while the main thread waits; if the parser has not
finished after a certain timeout, I kill the processing thread and
return.
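
A minimal sketch of the workaround described above, not the code from my
application. It assumes the parse step can be wrapped in a Callable; the
names TimedParse and runWithTimeout are made up for illustration, and no
PDFBox call is shown. It uses java.util.concurrent, which is newer than
this message, in place of killing the thread directly. Note that
cancel(true) only interrupts the worker, so a parser that ignores
interrupts keeps running in the background; the worker is a daemon
thread so that it cannot block JVM exit.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedParse {

    /**
     * Runs a parse task on a worker thread and waits at most timeoutMillis.
     * Returns the task's result, or null if it timed out or threw.
     */
    public static <T> T runWithTimeout(Callable<T> task, long timeoutMillis) {
        // Daemon worker: a parser that never returns cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "parser-worker");
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            // The calling thread blocks here, but only up to the timeout.
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the stuck worker and give up
            return null;
        } catch (ExecutionException e) {
            return null; // the parser itself threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            future.cancel(true);
            return null;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A task that finishes in time, and one that simulates a hang.
        String ok = runWithTimeout(() -> "parsed text", 1000);
        String stuck = runWithTimeout(() -> {
            Thread.sleep(60_000);
            return "never";
        }, 200);
        System.out.println(ok);
        System.out.println(stuck);
    }
}
```

In an indexer, a null result would mean "skip this document and move
on", which keeps one bad PDF from stalling the whole run.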

-- 
Best regards,
Andrzej Bialecki

-------------------------------------------------
Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
-------------------------------------------------
FreeBSD developer (http://www.freebsd.org)



