lucene-java-user mailing list archives

From Paul Elschot <paul.elsc...@xs4all.nl>
Subject Re: Searching is taking a lot...
Date Thu, 29 Jun 2006 07:00:28 GMT
On Thursday 29 June 2006 06:17, James Pine wrote:
> A HitCollector object invokes its collect method on
> every document which matches the query/filter
> submitted to the Searcher.search method. I think all
> you would need to do is pass in the page number and
> results per page to your HitCollector constructor and
> then in the collect method do the bookkeeping to keep
> track of where you are in the result set. A simple
> version of your HitCollector might look like this:
> 
> public class MyHitCollector extends HitCollector {
> 
>   private int docCount = 0;
>   private List documents = new ArrayList();
>   private int startDoc;
>   private int endDoc;
>   private Searcher searcher;
> 
>   public MyHitCollector(Searcher searcher, int requestedPageNumber,
>       int resultsPerPage) {
>     startDoc = (requestedPageNumber - 1) * resultsPerPage;
>     endDoc = requestedPageNumber * resultsPerPage;
>     this.searcher = searcher;
>   }
> 
>   public void collect(int id, float score) {
>     if(docCount >= startDoc && docCount < endDoc) {
>       documents.add(searcher.doc(id));

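This will hurt performance, because searcher.doc(id) retrieves the stored
document for each collected hit while the search is still running. It is
better to first collect only the document numbers (code without the proper
declarations):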

 public void collect(int id, float score) {
     if(docCount >= startDoc && docCount < endDoc) {
         docNrs.add(id); // or use int[] docNrs when possible.
  ....

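and retrieve the documents later, after collecting has finished:

        for (int i = 0; i < docNrs.size(); i++) {
          int docNr = ((Integer) docNrs.get(i)).intValue();
          documents.add(searcher.doc(docNr)); // may throw IOException
        }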

Doing things this way avoids the disk head moving back and forth between
different parts of the index during the collection.
Also, make sure to call searcher.doc(docNr) with the docNrs in sorted order.
They are normally collected in that order already, so there is normally no
need to change the order of the collected docNrs.
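
Putting the two phases together, a minimal sketch could look like the code
below. To keep it compilable without Lucene, PagingCollector only mirrors the
collect(int, float) signature instead of extending HitCollector, and
DocLoader is a hypothetical stand-in for searcher.doc(docNr). Both names, and
totalHits(), are assumptions made for illustration. The real Searcher.doc
throws IOException, which the stand-in leaves out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for searcher.doc(docNr); not part of Lucene. */
interface DocLoader<T> {
    T load(int docNr);
}

/**
 * Sketch of a paging collector. It mirrors collect(int, float) but does not
 * extend Lucene's HitCollector, so it compiles on its own.
 */
class PagingCollector {
    private final int startDoc; // first hit of the requested page (inclusive)
    private final int endDoc;   // end of the requested page (exclusive)
    private final int[] docNrs; // document numbers, in the order collected
    private int docCount = 0;   // number of hits seen so far
    private int size = 0;       // number of document numbers stored

    PagingCollector(int requestedPageNumber, int resultsPerPage) {
        startDoc = (requestedPageNumber - 1) * resultsPerPage;
        endDoc = requestedPageNumber * resultsPerPage;
        docNrs = new int[resultsPerPage];
    }

    /** Phase 1: remember only the document number; nothing is loaded here. */
    public void collect(int id, float score) {
        if (docCount >= startDoc && docCount < endDoc) {
            docNrs[size++] = id;
        }
        docCount++;
    }

    /** Phase 2: after the search is done, load documents in collected order. */
    <T> List<T> loadDocuments(DocLoader<T> loader) {
        List<T> documents = new ArrayList<T>(size);
        for (int i = 0; i < size; i++) {
            documents.add(loader.load(docNrs[i]));
        }
        return documents;
    }

    /** Total number of hits passed to collect, across all pages. */
    int totalHits() {
        return docCount;
    }
}
```

With a real Searcher, the collector would be passed to Searcher.search, and
loadDocuments would be called only after search returns, with a loader that
delegates to searcher.doc(docNr) and handles the IOException.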

Regards,
Paul Elschot
