lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Grant Ingersoll <gsing...@apache.org>
Subject Re: Multiple Indices vs Single Index
Date Thu, 20 Sep 2007 16:50:46 GMT
If the current version is working well, what is the reason to move?   
Is it just to make management of the indices easier?

On Sep 20, 2007, at 12:07 PM, Nikhil Chhaochharia wrote:

> OK, thanks.
>
> I actually have both systems implemented. The multi-index one is  
> being used currently and it works well.  I have deployed the single  
> index solution a few times during off-peak hours and the response  
> time has been almost the same as the multi-index solution.  I tried  
> to simulate some load but again my numbers were mostly similar for  
> both cases.
>
> I have already done all the suggested optimizations since I first  
> ran into problems a few months ago.  The performance had improved  
> considerably.  Since then, my traffic has increased and I have  
> again started facing some issues during peak-load hours.
>
> I guess I should get another box and run proper tests there.  Will  
> run a profiler also.
>
> Thanks for all the suggestions.
>
> Regards,
> Nikhil
>
>
> ----- Original Message ----
> From: Grant Ingersoll <gsingers@apache.org>
> To: java-user@lucene.apache.org
> Sent: Thursday, 20 September, 2007 9:25:01 PM
> Subject: Re: Multiple Indices vs Single Index
>
> OK, I thought you meant your index would have in it the name of the
> second index and would thus do a two-stage retrieval.
>
> At any rate, if you are saying your combined index with all the
> stored fields is ~3.4 GB I would think it would fit reasonably on the
> machine you have and perform reasonably.  Naturally, this depends on
> your application, your users, etc. and I can't make any guarantees,
> but I certainly recall others managing this size just fine.  See the
> many tips on improving searching and indexing on the Wiki (link at
> bottom in my signature) and do some profiling/testing.
>
> When you said your tests were inconclusive, what tests have you
> done?  If you can, run the tests in a profiler to see where your
> bottlenecks are.
>
> -Grant
>
>
> On Sep 20, 2007, at 11:16 AM, Nikhil Chhaochharia wrote:
>
>> I am sorry, it seems that I was not clear with what my problem is.
>> I will try to describe it again.
>>
>> My data is divided into 40 categories and at one time only one
>> category can be searched.  The GUI for the system will ask the user
>> to select the category from a drop-down.  Currently, I have a
>> separate index for every category.  The index sizes varies - one
>> category index is 10MB and another is 700MB.  Other index-sizes are
>> somewhere in between.
>>
>> I was wondering if it will be better to just have 1 large index
>> with all the 40 indices combined.  I do not need to do dual-queries
>> and my total index size (if I create a single index) is about
>> 3.4GB.  It will increase to maximum of 5-6 GB.  I am running this
>> on a dedicated machine with 8GB RAM.
>>
>> Unfortunately I do not have enough hardware to run both in parallel
>> and test properly.  Have just one server which is being used by
>> live users.  So it would be great if you could tell me whether I
>> should stick with my 40 indices or combine them into 1 index.  What
>> are the pros and cons of each approach ?
>>
>> Thanks,
>> Nikhil
>>
>>
>> ----- Original Message ----
>> From: Grant Ingersoll <gsingers@apache.org>
>> To: java-user@lucene.apache.org
>> Sent: Thursday, 20 September, 2007 7:57:21 PM
>> Subject: Re: Multiple Indices vs Single Index
>>
>> If I understand correctly, you want to do a two stage retrieval
>> right?  That is, look up in the initial index (3.4 GB) and then do a
>> second search on the sub index?  Presumably, you have to manage the
>> Searchers, etc. for each of the sub-indexes as well as the big
>> index.  This means you have to go through the hits from the first
>> search, then route, etc. correct?
>>
>> Have you tried creating one single index with all the (stored)
>> fields, etc?  Worst case scenario, assuming 1GB per index, is you
>> would have a 40GB index, but my guess is index compression will
>> reduce it more.  Since you are less than that anyway, have you tried
>> just the straightforward solution?  Or do you have other requirements
>> that force the sub-index solution?  Also, I am not sure it will work,
>> but it seems worth a try.  Of course, this also depends on how much
>> you expect your indexes to grow.
>>
>> Also, what was inconclusive about your tests?  Maybe you can describe
>> more what you have tried to date?
>>
>> Cheers,
>> Grant
>>
>> On Sep 20, 2007, at 3:50 AM, Nikhil Chhaochharia wrote:
>>
>>> Hi,
>>>
>>> I have about 40 indices which range in size from 10MB to 700MB.
>>> There are quite a few stored fields.  To get an idea of the
>>> document size, I have about 400k documents in the 700MB index.
>>>
>>> Depending on the query, I choose the index which needs to be
>>> searched.  Each query hits only one index.  I was wondering if
>>> creating a single index where every document will have the
>>> indexname as a field will be more efficient.  I created such an
>>> index and it was 3.4 GB in size.  My initial performance tests with
>>> it are not conclusive.
>>>
>>> Also, what are the other points to be addressed while deciding
>>> between 1 index and 40 indices.
>>>
>>> I have 8GB RAM on the machine.
>>>
>>>
>>> Thanks,
>>> Nikhil
>>>
>>>
>>>
>>> -------------------------------------------------------------------- 
>>> -
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>
>> --------------------------
>> Grant Ingersoll
>> http://lucene.grantingersoll.com
>>
>> Lucene Helpful Hints:
>> http://wiki.apache.org/lucene-java/BasicsOfPerformance
>> http://wiki.apache.org/lucene-java/LuceneFAQ
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>
> --------------------------
> Grant Ingersoll
> http://lucene.grantingersoll.com
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>

--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message