lucene-java-user mailing list archives

From Nikhil Chhaochharia <nikhil...@yahoo.com>
Subject Re: Multiple Indices vs Single Index
Date Thu, 20 Sep 2007 16:07:37 GMT
OK, thanks.

I actually have both systems implemented. The multi-index one is being used currently and
it works well.  I have deployed the single index solution a few times during off-peak hours
and the response time has been almost the same as the multi-index solution.  I tried to simulate
some load but again my numbers were mostly similar for both cases.
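
A comparison like the one described above could be scripted roughly as follows. This is only a minimal sketch: the class name LatencyCompare, the helper names, and the stand-in back ends in main are all hypothetical, and the harness is single-threaded, so it does not reproduce concurrent peak-load conditions. It replays the same query list against two search back ends and reports nearest-rank latency percentiles for each.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class LatencyCompare {

    /** Runs each query once through the back end; returns per-query latency in nanoseconds. */
    static long[] timeQueries(List<String> queries, Function<String, Integer> backend) {
        long[] latencies = new long[queries.size()];
        for (int i = 0; i < queries.size(); i++) {
            long start = System.nanoTime();
            backend.apply(queries.get(i));   // result (hit count) is ignored here
            latencies[i] = System.nanoTime() - start;
        }
        return latencies;
    }

    /** Nearest-rank percentile of the samples, for p in (0, 100]. */
    static long percentile(long[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Hypothetical query log; in practice this would come from real traffic.
        List<String> queries = List.of("lucene", "index size", "stored fields");

        // Stand-ins for the two deployments (assumptions, not real searchers).
        Function<String, Integer> multiIndex = q -> q.length();
        Function<String, Integer> singleIndex = q -> q.hashCode() & 0xff;

        long[] a = timeQueries(queries, multiIndex);
        long[] b = timeQueries(queries, singleIndex);
        System.out.println("multi  p50=" + percentile(a, 50) + "ns p95=" + percentile(a, 95) + "ns");
        System.out.println("single p50=" + percentile(b, 50) + "ns p95=" + percentile(b, 95) + "ns");
    }
}
```

Comparing percentiles rather than averages makes it easier to see whether one layout only falls behind on the slowest queries.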

I have already applied all the suggested optimizations since I first ran into problems a few
months ago, and performance improved considerably.  Since then, my traffic has increased
and I have again started facing issues during peak-load hours.

I guess I should get another box and run proper tests there.  I will also run a profiler.

Thanks for all the suggestions.

Regards,
Nikhil


----- Original Message ----
From: Grant Ingersoll <gsingers@apache.org>
To: java-user@lucene.apache.org
Sent: Thursday, 20 September, 2007 9:25:01 PM
Subject: Re: Multiple Indices vs Single Index

OK, I thought you meant your index would have in it the name of the  
second index and would thus do a two-stage retrieval.

At any rate, if you are saying your combined index with all the  
stored fields is ~3.4 GB I would think it would fit reasonably on the  
machine you have and perform reasonably.  Naturally, this depends on  
your application, your users, etc. and I can't make any guarantees,  
but I certainly recall others managing this size just fine.  See the  
many tips on improving searching and indexing on the Wiki (link at  
bottom in my signature) and do some profiling/testing.
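
To picture the two layouts being compared in this thread, here is a toy in-memory model. It is not Lucene code: the class IndexLayouts, the Doc type, and the "category:" term prefix are all made up for illustration. Layout A keeps one small postings map per category and routes each query to it; layout B keeps a single postings map and restricts every query by a category term. In Lucene itself the restriction in layout B would presumably be a required term clause on the category field.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class IndexLayouts {

    /** Toy document: an id, the category it belongs to, and its text. */
    static final class Doc {
        final int id;
        final String category;
        final String text;

        Doc(int id, String category, String text) {
            this.id = id;
            this.category = category;
            this.text = text;
        }
    }

    /** Layout A: one postings map per category (stands in for separate indices). */
    static Map<String, Map<String, Set<Integer>>> buildPerCategory(List<Doc> docs) {
        Map<String, Map<String, Set<Integer>>> indices = new HashMap<>();
        for (Doc d : docs) {
            Map<String, Set<Integer>> postings =
                    indices.computeIfAbsent(d.category, k -> new HashMap<>());
            for (String term : d.text.toLowerCase().split("\\s+")) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(d.id);
            }
        }
        return indices;
    }

    /** Layout B: one postings map for all documents, with the category as an extra term. */
    static Map<String, Set<Integer>> buildCombined(List<Doc> docs) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (Doc d : docs) {
            // The prefix keeps category terms apart from text terms in this toy model.
            postings.computeIfAbsent("category:" + d.category, k -> new TreeSet<>()).add(d.id);
            for (String term : d.text.toLowerCase().split("\\s+")) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(d.id);
            }
        }
        return postings;
    }

    /** Layout A search: pick the category's map, then look up the term. */
    static Set<Integer> searchPerCategory(Map<String, Map<String, Set<Integer>>> indices,
                                          String category, String term) {
        return indices.getOrDefault(category, Map.of()).getOrDefault(term, Set.of());
    }

    /** Layout B search: intersect the term's postings with the category's postings. */
    static Set<Integer> searchCombined(Map<String, Set<Integer>> postings,
                                       String category, String term) {
        Set<Integer> hits = new TreeSet<>(postings.getOrDefault(term, Set.of()));
        hits.retainAll(postings.getOrDefault("category:" + category, Set.of()));
        return hits;
    }
}
```

Both searches return the same document ids; the difference is that layout B has to intersect against postings drawn from all categories, while layout A only ever touches one category's data.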

When you said your tests were inconclusive, what tests have you  
done?  If you can, run the tests in a profiler to see where your  
bottlenecks are.

-Grant


On Sep 20, 2007, at 11:16 AM, Nikhil Chhaochharia wrote:

> I am sorry, it seems that I was not clear with what my problem is.   
> I will try to describe it again.
>
> My data is divided into 40 categories and at one time only one  
> category can be searched.  The GUI for the system will ask the user  
> to select the category from a drop-down.  Currently, I have a  
> separate index for every category.  The index sizes vary - one  
> category index is 10MB and another is 700MB.  Other index-sizes are  
> somewhere in between.
>
> I was wondering if it will be better to just have 1 large index  
> with all the 40 indices combined.  I do not need to do dual-queries  
> and my total index size (if I create a single index) is about  
> 3.4GB.  It will increase to maximum of 5-6 GB.  I am running this  
> on a dedicated machine with 8GB RAM.
>
> Unfortunately, I do not have enough hardware to run both in parallel  
> and test properly.  I have just one server, which is being used by  
> live users.  So it would be great if you could tell me whether I  
> should stick with my 40 indices or combine them into 1 index.  What  
> are the pros and cons of each approach?
>
> Thanks,
> Nikhil
>
>
> ----- Original Message ----
> From: Grant Ingersoll <gsingers@apache.org>
> To: java-user@lucene.apache.org
> Sent: Thursday, 20 September, 2007 7:57:21 PM
> Subject: Re: Multiple Indices vs Single Index
>
> If I understand correctly, you want to do a two-stage retrieval,
> right?  That is, look up in the initial index (3.4 GB) and then do a
> second search on the sub index?  Presumably, you have to manage the
> Searchers, etc. for each of the sub-indexes as well as the big
> index.  This means you have to go through the hits from the first
> search, then route, etc. correct?
>
> Have you tried creating one single index with all the (stored)
> fields, etc?  Worst case scenario, assuming 1GB per index, is you
> would have a 40GB index, but my guess is index compression will
> reduce it more.  Since you are less than that anyway, have you tried
> just the straightforward solution?  Or do you have other requirements
> that force the sub-index solution?  Also, I am not sure it will work,
> but it seems worth a try.  Of course, this also depends on how much
> you expect your indexes to grow.
>
> Also, what was inconclusive about your tests?  Maybe you can describe
> more what you have tried to date?
>
> Cheers,
> Grant
>
> On Sep 20, 2007, at 3:50 AM, Nikhil Chhaochharia wrote:
>
>> Hi,
>>
>> I have about 40 indices which range in size from 10MB to 700MB.
>> There are quite a few stored fields.  To get an idea of the
>> document size, I have about 400k documents in the 700MB index.
>>
>> Depending on the query, I choose the index which needs to be
>> searched.  Each query hits only one index.  I was wondering if
>> creating a single index where every document will have the
>> indexname as a field will be more efficient.  I created such an
>> index and it was 3.4 GB in size.  My initial performance tests with
>> it are not conclusive.
>>
>> Also, what other points should be considered when deciding
>> between 1 index and 40 indices?
>>
>> I have 8GB RAM on the machine.
>>
>>
>> Thanks,
>> Nikhil
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>
> --------------------------
> Grant Ingersoll
> http://lucene.grantingersoll.com
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ
>

--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ