lucene-java-user mailing list archives

From Itamar Syn-Hershko <ita...@code972.com>
Subject Re: multiple small indexes or one big index?
Date Fri, 10 Jun 2011 14:47:27 GMT
Erick,


Sorry about reopening this more than a week late...


You were asking about the size of each index; at what index size would 
you consider splitting into several indices with multiple searches etc, 
and for what reasons? Does it matter which Lucene version is used?


Thanks :)


Itamar.


On 02/06/2011 16:48, Alexander Rosemann wrote:

> No worries, I'll keep that in mind now.
> In addition I am going to switch to another collector as well. ATM I 
> collect the results and then sort them using the std. Collections.sort 
> approach... I have to look at what Lucene offers and switch to something 
> else.
>
> Thanks,
> Alex
>
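A note on the sorting step Alex mentions: when only the best few hits are needed, a bounded min-heap avoids sorting the whole result list with Collections.sort. Lucene's own IndexSearcher.search(Query, int) already returns just the top n hits as a TopDocs, so that is probably the first thing to try. The sketch below only illustrates the underlying idea with the JDK. The TopHits and Hit names are hypothetical (they are not Lucene classes), and Java 8 or later is assumed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class TopHits {
    /** Hypothetical scored hit used for illustration; not a Lucene class. */
    static final class Hit {
        public final int docId;
        public final float score;

        public Hit(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /** Returns the n highest-scoring hits, best first, keeping at most n hits in memory. */
    static List<Hit> topN(Iterable<Hit> hits, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Min-heap ordered by score: the weakest hit kept so far sits at the head.
        PriorityQueue<Hit> heap =
                new PriorityQueue<>(n, Comparator.comparingDouble((Hit h) -> h.score));
        for (Hit h : hits) {
            if (heap.size() < n) {
                heap.add(h);
            } else if (h.score > heap.peek().score) {
                heap.poll();   // drop the current weakest hit
                heap.add(h);   // keep the better one
            }
        }
        // Only the n survivors are sorted, not the full result set.
        List<Hit> result = new ArrayList<>(heap);
        result.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return result;
    }
}
```

Calling TopHits.topN(allHits, 10) costs roughly O(m log 10) for m collected hits, whereas sorting everything first costs O(m log m).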
> On 02.06.2011 15:36, Erick Erickson wrote:
>> Sounds good, just be sure to keep your (now single) searcher open! Also,
>> be sure to measure queries after a while. The first few queries will
>> fill up caches etc, so the time should improve after the first few.
>>
>> Best
>> Erick
>>
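Erick's point about measuring only after the caches have filled can be sketched as a small timing helper: run the query a few times untimed, then average the following runs. This is a rough sketch, not a proper benchmark harness. WarmTiming and its parameters are hypothetical names, and the Runnable stands in for whatever code executes one search.

```java
final class WarmTiming {
    /**
     * Runs the task warmupRuns times without timing it, then returns the
     * mean wall-clock time in nanoseconds over measuredRuns further runs.
     */
    static long meanNanosAfterWarmup(Runnable task, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        // Untimed runs: let caches (and the JIT) settle first.
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / measuredRuns;
    }
}
```

For example, WarmTiming.meanNanosAfterWarmup(() -> runQuery(), 10, 100) would time 100 searches after 10 discarded ones, where runQuery() is a placeholder for the application's own search call.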
>> On Thu, Jun 2, 2011 at 9:28 AM, Alexander Rosemann
>> <alexander.rosemann@gmail.com> wrote:
>>> Hi Erick, caching the IndexSearchers didn't take much effort and has
>>> already cut search time by 30%!
>>>
>>> I am busy changing the code to use a single index as you suggested atm.
>>> Still a few things left to be done, but once I have it working I'll let
>>> you know how much faster it is for me.
>>>
>>> Thanks,
>>> Alex
>>>
>>> On 02.06.2011 13:04, Erick Erickson wrote:
>>>>
>>>> At this size, really consider going to a single index. The lack of
>>>> administrative headaches alone is probably well worth the effort....
>>>>
>>>> I almost guarantee that the time you spend re-writing things to keep
>>>> the searchers open (and finding the bugs!) will be far more than just
>>>> putting all the data in a single index.
>>>>
>>>> But that might just be my preferences showing....
>>>>
>>>> Best
>>>> Erick
>>>>
>>>> On Wed, Jun 1, 2011 at 5:37 PM, Alexander Rosemann
>>>> <alexander.rosemann@gmail.com> wrote:
>>>>>
>>>>> Many thanks for the tips, Erick! I do close each searcher after a
>>>>> search...
>>>>> I will change that first thing tomorrow and let you know how that went.
>>>>> Multi-threaded searching will be next, and if that hasn't helped, I will
>>>>> switch to one big index.
>>>>> All indexes together are rather small: ~200MB and 50,000 documents.
>>>>>
>>>>> -Alex
>>>>>
>>>>> On 01.06.2011 23:26, Erick Erickson wrote:
>>>>>>
>>>>>> I'd start by putting them all in one index. There's no penalty
>>>>>> in Lucene for having empty fields in a document, unlike an
>>>>>> RDBMS.
>>>>>>
>>>>>> Alternately, if you're opening then closing searchers each
>>>>>> time, that's very expensive. Could you open the searchers
>>>>>> once and keep them open (all 90 of them)? That alone might
>>>>>> do the trick and be less of a change to your program. You
>>>>>> could also fire multiple threads at the searches, but check if
>>>>>> you're CPU bound first (if you are, multiple threads won't
>>>>>> help much/at all).
>>>>>>
>>>>>> You haven't said how big these indexes are nor how many
>>>>>> documents you're talking about here, so this advice is suspect.
>>>>>>
>>>>>> Do look at putting it all in one index though, let us know if you
>>>>>> have some data indicating how big stuff is/would be.
>>>>>>
>>>>>> Best
>>>>>> Erick
>>>>>>
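The two interim suggestions in Erick's reply (open the searchers once and keep them, then fire multiple threads at them) might look roughly like the sketch below. It uses only the JDK. The Searcher interface is a hypothetical stand-in for an already-open Lucene IndexSearcher (which is documented as thread-safe), and FanOutSearch is likewise an invented name. As Erick notes, this helps only if the searches are not already CPU bound.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class FanOutSearch {
    /** Hypothetical stand-in for an already-open, thread-safe searcher over one index. */
    interface Searcher {
        List<String> search(String query);
    }

    private final List<Searcher> searchers;  // opened once, reused for every query
    private final ExecutorService pool;

    FanOutSearch(List<Searcher> openSearchers, int threads) {
        this.searchers = new ArrayList<>(openSearchers);
        this.pool = Executors.newFixedThreadPool(threads);
    }

    /** Runs the query on every searcher in parallel; results are merged in searcher order. */
    List<String> searchAll(final String query) throws InterruptedException, ExecutionException {
        List<Callable<List<String>>> tasks = new ArrayList<>();
        for (final Searcher s : searchers) {
            tasks.add(() -> s.search(query));
        }
        List<String> merged = new ArrayList<>();
        // invokeAll blocks until all tasks finish and returns futures in task order.
        for (Future<List<String>> f : pool.invokeAll(tasks)) {
            merged.addAll(f.get());
        }
        return merged;
    }

    /** Stops the worker threads; call once when the application shuts down. */
    void shutdown() {
        pool.shutdown();
    }
}
```

The FanOutSearch instance is meant to live as long as the application, so that neither the searchers nor the thread pool are recreated per query.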
>>>>>> On Wed, Jun 1, 2011 at 4:35 PM, Alexander Rosemann
>>>>>> <alexander.rosemann@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi all, I was wondering whether you could give me some advice on
>>>>>>> how to improve my search performance.
>>>>>>>
>>>>>>> I have 90 Lucene indexes, each having different fields (~5 per
>>>>>>> Document). When I search, I always have to go through all indexes
>>>>>>> to build my result set. Searching one index takes approx. 100ms,
>>>>>>> thus searching all indexes takes 9s in total.
>>>>>>>
>>>>>>> How can I reduce the time it needs to search?
>>>>>>>
>>>>>>> I decided to create this many indexes because putting all data in
>>>>>>> one index would mean that a document would have ~400 fields, with
>>>>>>> most of them left empty. Is that ok? Would a single index be faster
>>>>>>> compared to multiple small ones?
>>>>>>>
>>>>>>> Any pointers are much appreciated.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Alex
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org

