lucene-java-user mailing list archives

From "Benjamin Stein" <>
Subject Re: Lucene as syslog storage
Date Tue, 20 Jun 2006 19:41:48 GMT
> I've personally indexed over 1,000,000 documents and Lucene doesn't even
> breathe hard.

We are in the hundreds of millions and growing, and Lucene does tend
to sweat a little bit, although it can certainly handle it.

You're going to have to understand the internals of Lucene a bit
more.  For example, we've had some serious bottlenecks when it
comes to sorting.  Comments like "sorting with strings takes more
memory" really compound when you have 4 million search results to
sort.

You'll definitely want to use multisearchers and partition your
indexes *intelligently* according to your business logic.  You might
even run into a scenario where you need multiple copies of your index,
each partitioned in a different way depending on the use case.
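
As a rough illustration of partitioning by business logic, here is a
minimal sketch that assumes syslog entries are split into one index per
calendar day.  The class name PartitionRouter, the "syslog-" directory
prefix, and the per-day scheme are all hypothetical and not taken from
our setup.  It only decides which partitions a query has to touch;
opening a searcher per directory and combining them in a multisearcher
is left out.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: maps syslog dates to per-day index partitions. */
final class PartitionRouter {
    private PartitionRouter() {}

    /** Name of the index directory holding all entries logged on the given day. */
    static String partitionFor(LocalDate day) {
        // LocalDate.toString() yields ISO-8601, e.g. 2006-06-20
        return "syslog-" + day;
    }

    /** Partitions a query must search when restricted to [from, to], both inclusive. */
    static List<String> partitionsFor(LocalDate from, LocalDate to) {
        List<String> names = new ArrayList<>();
        for (LocalDate d = from; !d.isAfter(to); d = d.plusDays(1)) {
            names.add(partitionFor(d));
        }
        return names;
    }

    public static void main(String[] args) {
        // A three-day query window only needs three partitions searched.
        LocalDate from = LocalDate.of(2006, 6, 18);
        LocalDate to = LocalDate.of(2006, 6, 20);
        System.out.println(partitionsFor(from, to));
    }
}
```

With a layout like this, a query limited to a time window searches only
the partitions for those days instead of the whole data set.  A second
copy of the index partitioned another way (by host, say) would need its
own routing function along the same lines.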

Finally, be prepared for indexing to take a looong time.

