lucene-java-user mailing list archives

From Grant Ingersoll
Subject Re: Is lucene right for us
Date Sun, 12 Oct 2008 14:53:51 GMT
Lucene should work quite well for this. You'll just need some
infrastructure around it to fetch each file and extract its contents (see
Lucene's Tika project). And yes, Lucene is thread-safe, so you can
index safely as you describe.
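
As a rough sketch of the "infrastructure around it" part, the code below walks several directories in parallel, one task per directory, and collects the four fields from the question for each file. It uses only the JDK. The names ParallelFileCollector, FileRecord, RecordSink, and indexAll are made up for illustration and are not part of Lucene. It also assumes the files are UTF-8 plain text; for other formats the read step would be replaced by a real extractor such as Tika. In an actual setup the sink would presumably wrap a single shared IndexWriter and call addDocument on it, which is safe to do from multiple threads.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelFileCollector {

    /** Hypothetical holder for the four fields named in the question. */
    static final class FileRecord {
        final String filename;
        final long filesize;      // bytes
        final long filedate;      // last-modified time, epoch milliseconds
        final String filecontent;

        FileRecord(String filename, long filesize, long filedate, String filecontent) {
            this.filename = filename;
            this.filesize = filesize;
            this.filedate = filedate;
            this.filecontent = filecontent;
        }
    }

    /** Hypothetical sink. It is called from several threads, so it must be thread-safe. */
    interface RecordSink {
        void accept(FileRecord record) throws IOException;
    }

    /** Reads one file. Assumes UTF-8 text; swap in a real extractor for other formats. */
    static FileRecord read(Path file) throws IOException {
        BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
        String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        return new FileRecord(file.getFileName().toString(), attrs.size(),
                attrs.lastModifiedTime().toMillis(), content);
    }

    /** Submits one task per directory and returns the total number of files handed to the sink. */
    static int indexAll(List<Path> dirs, int threads, RecordSink sink) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (Path dir : dirs) {
                results.add(pool.submit(() -> {
                    int count = 0;
                    try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
                        for (Path entry : entries) {
                            if (Files.isRegularFile(entry)) {
                                sink.accept(read(entry));
                                count++;
                            }
                        }
                    }
                    return count;
                }));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get(); // rethrows any failure from the task
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Splitting the work by directory keeps each task independent, so the only shared state is the sink itself.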

On Oct 11, 2008, at 10:22 AM, Mag Gam wrote:

> Hello All,
> At my university we have over 20,000 small files, ranging from 20k to
> 500k per directory and we would like to index them. I was wondering if
> Lucene is the right tool for this? The information we would like to
> keep is: filename, filesize, filedate, filecontent. Also, is it
> possible to run the initial index in multithreaded mode since we are
> talking about many directories with similar contents?

Grant Ingersoll
Lucene Boot Camp Training Nov. 3-4, 2008, ApacheCon US New Orleans.
