lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Satuluri, Venu_Madhav" <Venu.Madhav.Satul...@deshaw.com>
Subject RE: Improving search performance
Date Tue, 21 Mar 2006 14:09:41 GMT
You are right. I was unnecessarily transferring all the results from
Hits object to an ArrayList. I don't know why it never struck me but
this was the step that was taking a lot of time; it was staring at me
all the time.

Thanks, it's running much better now.

Venu 

-----Original Message-----
From: Grant Ingersoll [mailto:gsingers@syr.edu] 
Sent: Tuesday, March 21, 2006 7:03 PM
To: java-user@lucene.apache.org
Subject: Re: Improving search performance


I am not sure why you are getting all 60k docs at a time.  If you use 
the Hits object, it caches the top 50 or so, but doesn't retrieve all 
the documents at once.

Also, what are the size of your fields and how many fields do you have 
per document? 

Have you done any profiling to find the bottlenecks?  An index size of 
50mb is actually pretty small for Lucene, perhaps you can share more 
about your setup.

-Grant

Satuluri, Venu_Madhav wrote:
> Hi,
>
> I am looking for ways to improve the performance of lucene search in
our
> app. Lucene performance is visibly slow when there are a lot of
> documents to be returned (performance almost seems directly
proportional
> to the number of documents returned by Searcher). However, we show 20
> results per page, so it seems to be a waste of time to get all, say,
> 60,000 documents when all I need are the nth 20 (i.e. if user requests
> 4th page of results, I just need documents 61 to 80).  I've tried
using
> the Searcher.search() method that returns TopFieldDocs. This method
> works much faster than the ordinary Searcher.search() that returns all
> the results. The trouble with this method is that I cant use it to get
> an arbitrary portion of the results, I can only get the top few docs.
>
> Our index size is around 50 MB. Its optimized every one hour. I have
> tried a RAMDirectory, but even though using this improves performance
by
> upto 2 times, its not good enough. Also our index gets modified by
> multiple processes so keeping the RAMDirectory up-to-date is a hassle.
>
> Thanks,
> Venu
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>   

-- 

Grant Ingersoll 
Sr. Software Engineer 
Center for Natural Language Processing 
Syracuse University 
School of Information Studies 
335 Hinds Hall 
Syracuse, NY 13244 

http://www.cnlp.org 
Voice:  315-443-5484 
Fax: 315-443-6886 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message