lucene-lucene-net-user mailing list archives

From Pavlo Zahozhenko <>
Subject Re: Designing an index with constant speed no matter how big
Date Sat, 02 May 2009 22:10:02 GMT
The simplest solution, as I see it, is "sharding" your index: for example,
create one index for all users whose email address starts with the letter "a",
another index for users whose email address starts with "b", and so on
(you may assign a few rare letters to a single index so that your indexes
are more or less evenly sized). Then, when a user performs a search, you
will not search the whole MultiIndex, but only the index containing this
user's emails.
If such sharding is not enough, you may partition your index using a different
criterion, e.g. 1000 users per index, ordered by ID.
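
To make the routing idea concrete, here is a minimal sketch of both
partitioning schemes. It is plain Python with no Lucene calls, and every
name in it (LETTER_GROUPS, shard_by_email, shard_by_user_id, the "index_"
naming) is hypothetical, chosen only for illustration. The letter groups
are made up too; in practice you would pick them from your real address
distribution.

```python
# Hypothetical letter groups: rare initials share a shard so that
# shard sizes stay roughly comparable. Tune these against real data.
LETTER_GROUPS = ("ab", "cd", "efgh", "ijkl", "mn", "opqr", "st", "uvwxyz")


def shard_by_email(email, groups=LETTER_GROUPS):
    """Return the (hypothetical) index name for an email address,
    based on the first character of the address."""
    first = email.strip().lower()[:1]
    if not first:
        # Empty input: fall through to the catch-all shard.
        return "index_other"
    for group in groups:
        if first in group:
            return "index_" + group
    # Digits, punctuation and non-ASCII initials land in a catch-all shard.
    return "index_other"


def shard_by_user_id(user_id, users_per_shard=1000):
    """Return the (hypothetical) index name for a numeric user ID,
    placing a fixed number of consecutive IDs in each shard."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    return f"index_{user_id // users_per_shard:05d}"
```

The returned name would be the only index you open for that user's search.
Since each shard still holds several users, the query inside it would
presumably keep the OwnerId clause.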

As far as I know, this is a common practice for large indices.


2009/5/2 Pierre Henri Kuaté <>

> Hi,
> I am working on a project where full-text search gets slower as the number
> of (group of) documents increases.
> Here is a simplified description of the project: It is an email system, so
> each user has their own emails and can search for them using
> So logically, it should be possible to implement it so that its performance
> doesn't (really) drop as the number of users increases. The speed of a
> search should be based on the number of documents that the logged-in user has.
> My current implementation is to have a property OwnerId in each document
> and use it as a clause in the searches. Eg: OwnerId:123 AND
> MailContent:Something
> However, this doesn't work...
> The extreme solution would be to completely dissociate each user's index.
> But that would make my implementation harder to maintain.
> Do you have any suggestions?
> Pierre Henri.
