lucene-java-user mailing list archives

From Emmanuel Bernard <emman...@hibernate.org>
Subject Re: Improving search performance
Date Fri, 30 May 2008 21:18:37 GMT
Not the IndexSearcher directly, but you can pool the underlying
IndexReaders, which should give the same order of performance. You
need to write a ReaderProvider implementation (see
SharedReaderProvider or NotSharedReaderProvider for examples) and set
hibernate.search.reader.strategy to your ReaderProvider's class name.
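
As a rough illustration of the configuration step, here is a minimal sketch that only builds the property entry. The property name comes from the message above; the class name com.example.search.PooledReaderProvider is hypothetical, and how the properties reach Hibernate Search (hibernate.cfg.xml, persistence.xml, or programmatic configuration) depends on your setup.

```java
import java.util.Properties;

public class ReaderStrategyConfig {
    // Property name as given in the message above.
    public static final String READER_STRATEGY = "hibernate.search.reader.strategy";

    // Returns a Properties object that maps the reader strategy key to the
    // fully qualified name of a custom ReaderProvider implementation.
    public static Properties readerStrategy(String providerClassName) {
        Properties props = new Properties();
        props.setProperty(READER_STRATEGY, providerClassName);
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical class name, used for illustration only.
        Properties props = readerStrategy("com.example.search.PooledReaderProvider");
        props.list(System.out);
    }
}
```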

Let's continue the conversation here if needed: http://forum.hibernate.org/viewtopic.php?t=987104
Emmanuel

On May 29, 2008, at 10:49, Rakesh Shete wrote:

>
> Hi Emmanuel,
>
> Thanks for taking the time to look at this. At least the problem now
> looks clear. I will definitely try the pooled IndexSearcher
> approach.
>
> Could you let me know if there is a way of providing the
> IndexSearcher instance to the Hibernate Search FullTextQuery API?
>
> If that's not possible, I will probably have to use the Lucene APIs
> directly. At least that is what this thread suggests:
> http://forum.hibernate.org/viewtopic.php?t=987104
>
> --Rakesh S
>
>
>> From: emmanuel@hibernate.org
>> To: java-user@lucene.apache.org
>> Subject: Re: Improving search performance
>> Date: Thu, 29 May 2008 02:22:19 -0400
>>
>> Hi Rakesh,
>> I've spent the afternoon and the evening playing around with your test
>> because I could not stand Hibernate Search being significantly slower
>> than native Lucene ;)
>> I found several causes, but as far as your test case is concerned, it
>> turns out you are reaching the scalability limit of a single
>> IndexSearcher, as Otis mentioned a couple of emails ago. 100 threads
>> in parallel are too many. You are better off using a pool of
>> IndexSearchers.
>>
>> How many? It's hard to say, because the fewer IndexSearchers you have,
>> the more you benefit from the warm-up effect. I guess the optimal value
>> depends on your index size and the number of expected concurrent
>> threads. Give it a try with different values.
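
To make the pooling idea concrete, here is a minimal sketch of a fixed-size round-robin pool. It is an assumption about one way to do it, not code from this thread: the class RoundRobinPool and its methods are hypothetical names, and it uses modern Java (java.util.function.Supplier). In the setting discussed here, T would be an IndexSearcher and the factory would open one over the index. The pool size is the value to experiment with.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class RoundRobinPool<T> {
    private final List<T> members;
    private final AtomicInteger next = new AtomicInteger();

    // Creates `size` members up front so that each one stays warm across requests.
    public RoundRobinPool(int size, Supplier<T> factory) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be >= 1");
        }
        List<T> created = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            created.add(factory.get());
        }
        this.members = Collections.unmodifiableList(created);
    }

    // Hands out members in rotation. Members are shared, not checked out,
    // so this only suits objects that are safe for concurrent use.
    public T acquire() {
        // floorMod keeps the index non-negative after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), members.size());
        return members.get(index);
    }

    public int size() {
        return members.size();
    }
}
```

With a pool of size 1 this degenerates to the single shared searcher; larger sizes spread the threads across more instances at the cost of warming each one up separately.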
>>
>> Not related to this thread specifically, but I have kept notes of some
>> ideas to bring HSearch and Lucene back neck and neck performance-wise.
>> I've also explained there why your HSearch vs Lucene perf test was
>> not fair.
>> http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-194
>>
>> HTH
>>
>> Emmanuel
>>
>> On May 25, 2008, at 04:53, Rakesh Shete wrote:
>>
>>>
>>> Hi ,
>>>
>>> I have written a program to simulate the multi-thread behavior and
>>> check the performance.
>>> I have attached the program and results.
>>>
>>> The JVM parameters when running the program: -Xms128m -Xmx1028m on a
>>> P4 2.66 GHz, 1.99 GB RAM machine.
>>>
>>> Here are the observations:
>>> 1. The response time to fire a query and get the result back increases
>>> as the number of threads increases from 10 to 100 (I haven’t used a
>>> thread pool; the JVM's default behavior is used).
>>> 2. After optimizing the index, the response time improved
>>> significantly. For example, with 100 threads the average response
>>> time is 1-1.3 seconds with the optimized index, compared to
>>> 4-5.5 seconds with the non-optimized index.
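
For readers who want to reproduce this kind of measurement, here is a minimal sketch of a timing harness. It is not the attached program: LatencyProbe, measure, and average are hypothetical names, and the task passed in stands for whatever fires the query. It runs the task once on each of N threads and reports the per-run elapsed time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LatencyProbe {
    // Runs `task` once on each of `threads` threads and returns the
    // elapsed time of every run in milliseconds.
    public static List<Long> measure(int threads, Runnable task) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> jobs = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                jobs.add(() -> {
                    long start = System.nanoTime();
                    task.run();
                    return (System.nanoTime() - start) / 1_000_000;
                });
            }
            List<Long> elapsedMillis = new ArrayList<>();
            // invokeAll blocks until every job has finished.
            for (Future<Long> result : executor.invokeAll(jobs)) {
                elapsedMillis.add(result.get());
            }
            return elapsedMillis;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while measuring", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    // Arithmetic mean of the samples, or 0.0 for an empty list.
    public static double average(List<Long> values) {
        return values.stream().mapToLong(Long::longValue).average().orElse(0.0);
    }
}
```

Calling measure with 10, 50, and 100 threads and comparing the averages gives numbers comparable in spirit to the ones above, although a single run per thread is noisy and says nothing about warm-up.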
>>>
>>> Is there anything else that I should be looking into to further
>>> improve the performance?
>>>
>>> --Rakesh S
>>>
>>>> Date: Fri, 23 May 2008 20:24:03 -0700
>>>> From: otis_gospodnetic@yahoo.com
>>>> Subject: Re: Improving search performance
>>>> To: java-user@lucene.apache.org
>>>>
>>>> Hi Emmanuel,
>>>>
>>>> Because there are some synchronized methods, like the one that
>>>> checks whether a doc is deleted, that get called during search. If
>>>> you have a pile of threads (the original poster mentioned 100), there
>>>> could be contention around those methods.
>>>>
>>>> Otis
>>>> --
>>>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>>>
>>>>
>>>> ----- Original Message ----
>>>>> From: Emmanuel Bernard
>>>>> To: java-user@lucene.apache.org
>>>>> Sent: Friday, May 23, 2008 6:41:36 PM
>>>>> Subject: Re: Improving search performance
>>>>>
>>>>> Hi
>>>>> Hibernate Search does not pool the Searcher but pools the underlying
>>>>> IndexReader(s). From what I've seen, a Searcher is stateless and all
>>>>> the state is kept in the Readers, so this is essentially equivalent to
>>>>> reusing the searcher.
>>>>>
>>>>> Out of curiosity, why is a pool of Searchers more efficient?
>>>>>
>>>>> Emmanuel
>>>>>
>>>>> On May 22, 2008, at 13:22, Otis Gospodnetic wrote:
>>>>>
>>>>>> Some quick feedback. Those are all very expensive queries
>>>>>> (wildcards and ranges). The first thing I'd do is try without
>>>>>> Hibernate Search (to make sure HS is not the bottleneck). 100
>>>>>> threads is a lot. I'm guessing you are reusing your searcher, which
>>>>>> is good, but you will actually improve performance a bit if you work
>>>>>> with a small pool of searchers instead of a single searcher.
>>>>>>
>>>>>> Otis
>>>>>> --
>>>>>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>>>>>
>>>>>>
>>>>>> ----- Original Message ----
>>>>>>> From: Rakesh Shete
>>>>>>> To: java-user-info@lucene.apache.org; java-user@lucene.apache.org
>>>>>>> Sent: Thursday, May 22, 2008 1:16:13 PM
>>>>>>> Subject: Improving search performance
>>>>>>>
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I have an index of size 85 MB. My query looks as follows:
>>>>>>>
>>>>>>> +(t:boss* d:boss* dd:boss* tg:boss*) +st:act +ntid:0 +cid:1
>>>>>>> +dr:[20080410 TO 20081010] +rT:[002 TO 005]
>>>>>>>
>>>>>>> All the fields used in the query are stored in the index
>>>>>>> (Indexed & Stored).
>>>>>>>
>>>>>>> The query response time for me is around 30 seconds when running
>>>>>>> multiple simultaneous threads (~100). The number of matches is ~30k,
>>>>>>> but I retrieve only the top 100 results. I am using Hibernate Search,
>>>>>>> which is a wrapper around Lucene. I retrieve the "id" field from the
>>>>>>> index, which is also indexed and stored.
>>>>>>>
>>>>>>> What is the approach that I should take for improving the
>>>>>>> performance?
>>>>>>>
>>>>>>> Will just indexing the values without storing them work
>>>>>>> (Index & UnStored)?
>>>>>>>
>>>>>>> My machine configuration is:
>>>>>>> P4 2.66GHz 1.99 GB RAM
>>>>>>>
>>>>>>> The code for searching runs in the JBoss application server, which
>>>>>>> has a maximum heap size of 1024MB. When these 100 threads are running
>>>>>>> in the application server, the CPU utilization is 100% and JBoss
>>>>>>> consumes all of the heap.
>>>>>>>
>>>>>>> Any pointers on index optimization would be really appreciated.
>>>>>>>
>>>>>>> --Regards,
>>>>>>> Rakesh Shete
>>>>>>>
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>> <Performance logs.zip>
>>
>>
>>
>

