lucene-java-user mailing list archives

From Emmanuel Bernard <emman...@hibernate.org>
Subject Re: Improving search performance
Date Thu, 29 May 2008 06:22:19 GMT
Hi Rakesh,
I've spent the afternoon and the evening playing around with your test
because I could not stand Hibernate Search being significantly slower
than native Lucene ;)
I found several causes, but as far as your test case is concerned, it
turns out you are reaching the scalability limit of a single
IndexSearcher, as Otis mentioned a couple of emails ago. 100 threads in
parallel are too many. You are better off using a pool of IndexSearchers.

How many? It's hard to say, because the fewer IndexSearchers you have,
the more you benefit from the warm-up effect. I guess the optimal value
depends on your index size and the number of expected concurrent
threads. Give it a try with different values.
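
A minimal sketch of the pooling idea, for illustration only. It is not
code from Hibernate Search or Lucene: RoundRobinPool, next() and size()
are hypothetical names, and the pool is generic so the example runs
without Lucene on the classpath. In a real test, T would be an
IndexSearcher opened over the index, and the pool size (4 below) is an
arbitrary starting point to tune.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/**
 * Hypothetical helper: a fixed-size pool that hands out shared,
 * thread-safe instances in round-robin order.
 */
final class RoundRobinPool<T> {
    private final List<T> items;
    private final AtomicInteger cursor = new AtomicInteger();

    RoundRobinPool(int size, Supplier<T> factory) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be >= 1");
        }
        List<T> created = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            created.add(factory.get()); // e.g. open one searcher per slot
        }
        this.items = Collections.unmodifiableList(created);
    }

    /** Returns the next instance; safe to call from many threads. */
    T next() {
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(cursor.getAndIncrement(), items.size());
        return items.get(index);
    }

    int size() {
        return items.size();
    }
}

final class SearcherPoolDemo {
    public static void main(String[] args) {
        // Plain Objects stand in for searchers (assumption for the demo).
        RoundRobinPool<Object> pool = new RoundRobinPool<>(4, Object::new);
        for (int i = 0; i < 8; i++) {
            Object searcher = pool.next();
            System.out.println("request " + i + " -> searcher@"
                    + System.identityHashCode(searcher));
        }
    }
}
```

Each search thread would call next() once per query and never close
the instance it gets back. Trying sizes such as 1, 2, 4 and 8 against
the same 100-thread test is one way to look for the best value.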

Not related to this thread specifically, but I have kept notes of some
ideas to bring HSearch and Lucene back neck and neck performance-wise.
I've also explained there why your HSearch vs Lucene perf test was not
fair.
http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-194

HTH

Emmanuel

On  May 25, 2008, at 04:53, Rakesh Shete wrote:

>
> Hi ,
>
> I have written a program to simulate the multi-thread behavior and  
> check the performance.
> I have attached the program and results.
>
> The JVM parameters when running the program: -Xms128m -Xmx1028m on a
> P4 2.66 GHz, 1.99 GB RAM machine.
>
> Here are the observations:
> 1. The response time for a query increases as the number of threads
> increases from 10 to 100 (I haven't used any thread pool; the JVM
> default behavior is used).
> 2. After optimizing the index, the response time improved
> significantly. For example, with 100 threads the average response
> time is 1-1.3 seconds with the optimized index, compared to
> 4-5.5 seconds with the non-optimized index.
>
> Is there anything else that I should be looking into to further  
> improve the performance?
>
> --Rakesh S
>
>> Date: Fri, 23 May 2008 20:24:03 -0700
>> From: otis_gospodnetic@yahoo.com
>> Subject: Re: Improving search performance
>> To: java-user@lucene.apache.org
>>
>> Hi Emmanuel,
>>
>> Because there are some synchronized methods, like the one that
>> checks whether a doc is deleted, that get called during search. If
>> you have a pile of threads (the original poster mentioned 100
>> threads) there could be contention around those methods.
>>
>> Otis
>> --
>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>
>>
>> ----- Original Message ----
>>> From: Emmanuel Bernard
>>> To: java-user@lucene.apache.org
>>> Sent: Friday, May 23, 2008 6:41:36 PM
>>> Subject: Re: Improving search performance
>>>
>>> Hi
>>> Hibernate Search does not pool the Searcher but pools the underlying
>>> IndexReader(s). From what I've seen, a Searcher is stateless and all
>>> the state is kept in the Readers, so this is essentially equivalent
>>> to reusing the searcher.
>>>
>>> Out of curiosity why is a pool of Searcher more efficient?
>>>
>>> Emmanuel
>>>
>>> On May 22, 2008, at 13:22, Otis Gospodnetic wrote:
>>>
>>>> Some quick feedback. Those are all very expensive queries
>>>> (wildcards and ranges). The first thing I'd do is try without
>>>> Hibernate Search (to make sure HS is not the bottleneck). 100
>>>> threads is a lot. I'm guessing you are reusing your searcher, which
>>>> is good, but you will actually improve performance a bit if you
>>>> work with a small pool of searchers instead of a single searcher.
>>>>
>>>> Otis
>>>> --
>>>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>>>
>>>>
>>>> ----- Original Message ----
>>>>> From: Rakesh Shete
>>>>> To: java-user-info@lucene.apache.org; java-user@lucene.apache.org
>>>>> Sent: Thursday, May 22, 2008 1:16:13 PM
>>>>> Subject: Improving search performance
>>>>>
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I have an index of size 85MB. My query looks as follows:
>>>>>
>>>>> +(t:boss* d:boss* dd:boss* tg:boss*) +st:act +ntid:0 +cid:1 +dr:[20080410 TO 20081010] +rT:[002 TO 005]
>>>>>
>>>>> All the fields used in the query are stored in the indexes
>>>>> (Indexed & Stored).
>>>>>
>>>>> The query response time for me is around 30 seconds when running
>>>>> multiple simultaneous threads (~100). The no. of matches is ~30k
>>>>> but I retrieve only the top 100 results. I am using Hibernate
>>>>> Search, which is a wrapper around Lucene. I retrieve the "id"
>>>>> field from the index, which is also indexed and stored.
>>>>>
>>>>> What is the approach that I should take for improving the
>>>>> performance?
>>>>>
>>>>> Will just indexing the values without storing them work (Index &
>>>>> UnStored)?
>>>>>
>>>>> My machine configuration is:
>>>>> P4 2.66GHz 1.99 GB RAM
>>>>>
>>>>> The code for searching runs in JBoss application server, which
>>>>> has a maximum heap size of 1024MB. When these 100 threads are
>>>>> running in the application server, the CPU utilization is 100%
>>>>> and JBoss consumes all of the heap.
>>>>>
>>>>> Any pointers on index optimization would be really appreciated.
>>>>>
>>>>> --Regards,
>>>>> Rakesh Shete
>>>>>
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>
>>>
>>
>>
>>
>