lucene-java-user mailing list archives

From "Erick Erickson" <>
Subject Re: Multiple index performance
Date Tue, 19 Aug 2008 13:21:00 GMT
Another issue is opening/closing your indexes. When you open an
index for searching, the first few queries incur considerable
overhead while the caches warm up. Plus, with many small indexes you
don't get any economies of scale: past a certain size, adding 2X the
text to an index grows the index by considerably less than 2X,
provided you're not storing the text.

So, you either have to keep 10,000 indexes open for efficient searching,
or open/close each one on demand and live with the consequent hit to
your searching performance.

I'd think about keeping it all in a large index, storing the user's name
as a field and appending something like "AND user:cyndy" to each
search. You could also assemble a filter for your user and tack that on
to the query. But the above clause is conceptually simplest.
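
As a rough sketch of the "AND user:cyndy" idea, assuming the user name is indexed as a single untokenized term in a field called "user" (the class and method names below are made up for illustration and are not part of Lucene):

```java
// Hypothetical helper, not part of Lucene: scopes a query string to one user
// by wrapping it in parentheses and appending a required user clause.
final class UserScopedQuery {

    // Characters with special meaning in Lucene's classic query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    /** Backslash-escapes query-syntax characters in a user id. */
    static String escape(String raw) {
        StringBuilder sb = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /**
     * Returns "(userQuery) AND user:userId". The parentheses keep an
     * OR in the user's query from swallowing the user clause.
     * Assumption: user ids are non-empty and contain no whitespace.
     */
    static String scope(String userQuery, String userId) {
        if (userId == null || userId.isEmpty()) {
            throw new IllegalArgumentException("userId must not be empty");
        }
        for (int i = 0; i < userId.length(); i++) {
            if (Character.isWhitespace(userId.charAt(i))) {
                throw new IllegalArgumentException("userId must not contain whitespace");
            }
        }
        return "(" + userQuery + ") AND user:" + escape(userId);
    }

    public static void main(String[] args) {
        // prints: (lucene OR solr) AND user:cyndy
        System.out.println(scope("lucene OR solr", "cyndy"));
    }
}
```

The resulting string would then go to the query parser as usual. Since privacy matters here, note that plain string concatenation trusts the user's query text: unbalanced parentheses in it could change what the appended clause applies to. Parsing the user's query first and adding it, together with a TermQuery on the user field, as required (MUST) clauses of a BooleanQuery, or using a filter, is likely the more robust route.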


On Mon, Aug 18, 2008 at 10:34 PM, Cyndy <> wrote:

> Hello, I am new to Lucene and I want to make sure that what I am trying to do
> will not hurt performance. My scenario is the following:
> I want to keep users' text files indexed separately. I will have about 10,000
> users, each user may have about 20,000 short files, and I need to preserve
> privacy. So the idea is to have one folder with the text files and an index
> for each user, so that a search points only at the corresponding
> user's directory. Would this approach hurt performance? Is this a
> good solution? Any recommendations?
> Thanks in advance.
