lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: PATCH: PriorityQueue
Date Thu, 11 Sep 2003 10:42:36 GMT
Thank you, this all looks correct and elegant to me.
Patches applied, about to be checked in.

Please keep this stuff coming, if you notice any other problems.

Otis

--- Christoph Goller <goller@detego-software.de> wrote:
> Hi folks,
> 
> This is a suggestion how to make code with PriorityQueue more
> consistent
> and easier to understand. I am not talking about a bug. The current
> code
> seems to work fine.
> 
> Currently there are two different kinds of applications of
> PriorityQueue.
> In SegmentMergeQueue, PhraseQueue, and TermPositionsQueue one knows
> in advance the maximum number of elements in the queue and the queue
> is
> only used to order the elements. One is never tempted to add more
> elements
> than the specified number to the queue. With HitQueue one wants to
> find the
> n best hits within an a priori unknown number of hits.
> 
> My problem:
> 
> 1) PriorityQueue currently is initialized with a maximum size,
> however it
> actually can hold twice the number of elements. This feature is only
> used
> by HitQueue which currently must be able to hold one element more
> than
> maxSize. This is required by the code in IndexSearcher and
> MultiSearcher.
> 
> 2) I find the code in IndexSearcher and MultiSearcher for controlling
> the
> size of the queue a little bit confusing. Furthermore, if the queue
> is already
> full and a new element has a higher score than top(), it is more
> efficient
> to change the top-element and call adjustTop() than to add the new
> element
> and call pop() afterwards (what is currently done).
> 
> Suggestion:
> 
> IŽd like to introduce a clearer contract about the maxSize in
> PriorityQueue.
> PriorityQueue should be able to hold exactly maxSize elements and
> put()
> should throw an exception if one accidently tries to add more
> elements.
> Furthermore IŽd like to add a method insert() to PriorityQueue that
> only adds
> a new element if either the queue is not full or !lessThan(element,
> top()).
> This new method can then be used in IndexSearcher and MultiSearcher.
> 
> Patch files for PriorityQueue, IndexSearcher, MultiSearcher, and
> TestPriorityQueue (one additional test) are attached.
> 
> kind regards,
> Christoph
> 
> -- 
> *****************************************************************
> * Dr. Christoph Goller       Tel.:   +49 89 203 45734           *
> * Detego Software GmbH       Mobile: +49 179 1128469            *
> * Keuslinstr. 13             Fax.:   +49 721 151516176          *
> * 80798 München, Germany     Email:  goller@detego-software.de  *
> *****************************************************************
> > Index: PriorityQueue.java
> ===================================================================
> RCS file:
>
/home/cvspublic/jakarta-lucene/src/java/org/apache/lucene/util/PriorityQueue.java,v
> retrieving revision 1.3
> diff -u -r1.3 PriorityQueue.java
> --- PriorityQueue.java	7 Nov 2002 05:55:40 -0000	1.3
> +++ PriorityQueue.java	4 Sep 2003 13:29:40 -0000
> @@ -60,6 +60,7 @@
>  public abstract class PriorityQueue {
>    private Object[] heap;
>    private int size;
> +  private int maxSize;
>  
>    /** Determines the ordering of objects in this priority queue. 
> Subclasses
>      must define this one method. */
> @@ -68,16 +69,41 @@
>    /** Subclass constructors must call this. */
>    protected final void initialize(int maxSize) {
>      size = 0;
> -    int heapSize = (maxSize * 2) + 1;
> +    int heapSize = maxSize + 1;
>      heap = new Object[heapSize];
> +    this.maxSize = maxSize;
>    }
>  
> -  /** Adds an Object to a PriorityQueue in log(size) time. */
> +  /**
> +   * Adds an Object to a PriorityQueue in log(size) time.
> +   * If one tries to add more objects than maxSize from initialize
> +   * a RuntimeException (ArrayIndexOutOfBound) is thrown.
> +   */
>    public final void put(Object element) {
>      size++;
>      heap[size] = element;
>      upHeap();
>    }
> +  
> +  /**
> +   * Adds element to the PriorityQueue in log(size) time if either
> +   * the PriorityQueue is not full, or !lessThan(element, top()).
> +   * @param element
> +   * @return true if element is added, false otherwise.
> +   */
> +  public boolean insert(Object element){
> +    if(size < maxSize){
> +        put(element);
> +        return true;
> +    }
> +    else if(size > 0 && !lessThan(element, top())){
> +        heap[1] = element;
> +        adjustTop();
> +        return true;
> +    }
> +    else
> +        return false;
> +   }
>  
>    /** Returns the least element of the PriorityQueue in constant
> time. */
>    public final Object top() {
> > Index: IndexSearcher.java
> ===================================================================
> RCS file:
>
/home/cvspublic/jakarta-lucene/src/java/org/apache/lucene/search/IndexSearcher.java,v
> retrieving revision 1.9
> diff -u -r1.9 IndexSearcher.java
> --- IndexSearcher.java	24 May 2003 15:24:26 -0000	1.9
> +++ IndexSearcher.java	4 Sep 2003 12:41:44 -0000
> @@ -133,18 +133,11 @@
>      final HitQueue hq = new HitQueue(nDocs);
>      final int[] totalHits = new int[1];
>      scorer.score(new HitCollector() {
> -	private float minScore = 0.0f;
>  	public final void collect(int doc, float score) {
>  	  if (score > 0.0f &&			  // ignore zeroed buckets
>  	      (bits==null || bits.get(doc))) {	  // skip docs not in bits
>  	    totalHits[0]++;
> -	    if (score >= minScore) {
> -	      hq.put(new ScoreDoc(doc, score));	  // update hit queue
> -	      if (hq.size() > nDocs) {		  // if hit queue overfull
> -		hq.pop();			  // remove lowest in hit queue
> -		minScore = ((ScoreDoc)hq.top()).score; // reset minScore
> -	      }
> -	    }
> +            hq.insert(new ScoreDoc(doc, score));
>  	  }
>  	}
>        }, reader.maxDoc());
> > Index: MultiSearcher.java
> ===================================================================
> RCS file:
>
/home/cvspublic/jakarta-lucene/src/java/org/apache/lucene/search/MultiSearcher.java,v
> retrieving revision 1.10
> diff -u -r1.10 MultiSearcher.java
> --- MultiSearcher.java	29 Jan 2003 17:18:54 -0000	1.10
> +++ MultiSearcher.java	4 Sep 2003 12:43:13 -0000
> @@ -144,7 +144,6 @@
>    public TopDocs search(Query query, Filter filter, int nDocs)
>        throws IOException {
>      HitQueue hq = new HitQueue(nDocs);
> -    float minScore = 0.0f;
>      int totalHits = 0;
>  
>      for (int i = 0; i < searchables.length; i++) { // search each
> searcher
> @@ -153,15 +152,9 @@
>        ScoreDoc[] scoreDocs = docs.scoreDocs;
>        for (int j = 0; j < scoreDocs.length; j++) { // merge
> scoreDocs into hq
>  	ScoreDoc scoreDoc = scoreDocs[j];
> -	if (scoreDoc.score >= minScore) {
> -	  scoreDoc.doc += starts[i];		  // convert doc
> -	  hq.put(scoreDoc);			  // update hit queue
> -	  if (hq.size() > nDocs) {		  // if hit queue overfull
> -	    hq.pop();				  // remove lowest in hit queue
> -	    minScore = ((ScoreDoc)hq.top()).score; // reset minScore
> -	  }
> -	} else
> -	  break;				  // no more scores > minScore
> +        scoreDoc.doc += starts[i];                      // convert
> doc
> +        if(!hq.insert(scoreDoc))
> +            break;                                                //
> no more scores > minScore
>        }
>      }
>  
> > Index: TestPriorityQueue.java
> ===================================================================
> RCS file:
>
/home/cvspublic/jakarta-lucene/src/test/org/apache/lucene/util/TestPriorityQueue.java,v
> retrieving revision 1.3
> diff -u -r1.3 TestPriorityQueue.java
> --- TestPriorityQueue.java	5 Jun 2002 01:50:54 -0000	1.3
> +++ TestPriorityQueue.java	4 Sep 2003 13:39:34 -0000
> @@ -135,4 +135,16 @@
>  	pq.clear();
>  	assertEquals(0, pq.size());
>      }
> +    
> +    public void testFixedSize(){
> +        PriorityQueue pq = new IntegerQueue(3);
> +        pq.insert(new Integer(2));
> +        pq.insert(new Integer(3));
> +        pq.insert(new Integer(1));
> +        pq.insert(new Integer(5));
> +        pq.insert(new Integer(7));
> +        pq.insert(new Integer(1));
> +        assertEquals(3, pq.size());
> +        assertEquals(3, ((Integer) pq.top()).intValue());
> +    }
>  }
> 
> >
---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org


__________________________________
Do you Yahoo!?
Yahoo! SiteBuilder - Free, easy-to-use web site design software
http://sitebuilder.yahoo.com

Mime
View raw message