lucene-java-user mailing list archives

From "Chris Lu" <>
Subject Re: 100,000 indexes and what to do
Date Sat, 11 Mar 2006 17:10:17 GMT
I think it's best to have one small index for each customer, and one
large index for the company's documents.

Merging customers' content into the main index costs a lot of
resources and slows the system down, and it isn't actually necessary.
If indexing is done by a batch job, there will be a delay between the
time content is updated and the time the index is refreshed. This may
be acceptable in some cases, but users usually want to search their
own content right away.

With a small index per customer, indexing 10~20 small documents takes
almost no time, so customers can search their content right after it
is updated.
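
To make the layout concrete, here is a rough sketch in plain Java. It
does not use the Lucene API: TinyIndex is a hypothetical toy stand-in
for a real index, and PerCustomerSearch, indexCustomerDoc,
indexCompanyDoc and the sample IDs are names made up for illustration.
It only shows the shape of the idea: one small index per customer, one
shared company index, and a search that optionally looks in both.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy stand-in for an index (NOT Lucene): maps each lowercase term to the IDs of documents containing it. */
class TinyIndex {
    private final Map<String, Set<String>> postings = new HashMap<>();

    void add(String docId, String text) {
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new LinkedHashSet<>()).add(docId);
            }
        }
    }

    Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Set.of());
    }
}

/** Hypothetical layout: one small index per customer plus one shared company index. */
class PerCustomerSearch {
    private final Map<String, TinyIndex> customerIndexes = new HashMap<>();
    private final TinyIndex companyIndex = new TinyIndex();

    /** Updates only this customer's small index, so the change is searchable immediately. */
    void indexCustomerDoc(String customerId, String docId, String text) {
        customerIndexes.computeIfAbsent(customerId, id -> new TinyIndex()).add(docId, text);
    }

    void indexCompanyDoc(String docId, String text) {
        companyIndex.add(docId, text);
    }

    /** Searches the customer's own index, and the company index too when requested. */
    List<String> search(String customerId, String term, boolean includeCompany) {
        List<String> hits = new ArrayList<>();
        TinyIndex own = customerIndexes.get(customerId);
        if (own != null) {
            hits.addAll(own.search(term));
        }
        if (includeCompany) {
            hits.addAll(companyIndex.search(term));
        }
        return hits;
    }

    public static void main(String[] args) {
        PerCustomerSearch s = new PerCustomerSearch();
        s.indexCompanyDoc("spec-1", "Lucene index format specification");
        s.indexCustomerDoc("cust-42", "note-1", "My notes on the Lucene index");
        s.indexCustomerDoc("cust-7", "note-9", "Private Lucene notes");

        System.out.println(s.search("cust-42", "lucene", false)); // prints [note-1]
        System.out.println(s.search("cust-42", "lucene", true));  // prints [note-1, spec-1]
    }
}
```

Because each customer's documents live in their own index, another
customer's documents can never show up in the results, and no customer
ID field is needed to constrain the query. In a real setup each
TinyIndex would be a Lucene index on disk, searched together with the
company index through whatever multi-index search facility your Lucene
version provides.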

Chris Lu
Full-Text Search on Any Databases

On 3/10/06, Lawrence <> wrote:
> Hi all,
> I was reading one of the postings on concurrency and I reread section 9.1 in Lucene in
> Action, which led me to this question. I have 100,000 customers and I want to provide them
> with personal searching for their documents and sometimes to include company documents in
> that search.
> 1.      100,000 customers with 10-20 small documents each.
> 2.      Company: 5,000 documents, specifications, papers, research, etc.
> 3.      Customers can search their own documents and company documents.
> P1: Do I provide an index for each customer and allow them multiple index searching,
> into company document when they need it?
> OR
> P2: Do I provide one large index for all my 100,000 customers, adding a field for customer
> ID so searching can be constrained, so they won't/can't search across other customer's documents,
> and then categorize company documents so customers can do multiple index searches into company
> After writing this out I realize that P2 is probably the wiser choice, less complicated,
> but I would like to hear from other Luceners.
> Lucene in Action is one of the best written books in my library of ~300 CS books. It
> ranks in completeness and clarity up there with works by David Geary, Martin Fowler, and other
> Hatcher greats like Java Development with Ant.
> Thanks Otis and Erik.
> Regards, Lawrence
