lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ivaylo Zlatev" <>
Subject find attached an improved PriorityQueue object
Date Wed, 13 Feb 2002 01:43:38 GMT

Today I poked into the org.apache.lucene.util.PriorityQueue object and
decided that it could be written much better.
Not from alghoritmic standpoint, but from good programing practices
Find attached the new version. You will love it.
Changes I made:

1. It is not an abtract class any more. The abstract method
abstract protected boolean lessThan(Object a, Object b);
has been removed. The idea is that the put(...) methid now accepts an
implementing the Comparable interface. Hence, the a.compareTo(Object b)
is used instead of the lessThan(Object a, Object b) method.
All you have to do is to write a short compareTo(Object b) method in
ScoreDoc, simmilar to the lessThan(...) method in
Now HitQueue is obsolete and the Searchers should use directly the
PriorityQueue object.

2. Up until now, the Searchers were responsible to call pop() after
put() to maintain the
PriorityQueue size constant. This seems incorrect. I modified the put()
method to do a 
pop() when the PriorityQueue is full; The popped element is actually
returned by the
put() metohd (like a side effect of put()). 

3. I modified the length of the heap array to be maxSize+2. Until now it
was (maxSize*2)+1 , which is a big waste of memory. I understand that
these are just references to null-s, but still, it's a major improvement
in Lucene. If a query returned a milion hits and all of them were
requested (nDocs=1000000), the heap array was of lenght 2000000!
During my tests this array size worked well.

4. Now the PriorityQueue supports a maxSize of 0 (the old one was
crashing during put()), which is extremely
useful if you want to do a search and are interested just in the total
number of hits, not in the search results.
I am talking about the method
final TopDocs search(Query query, Filter filter, final int nDocs), when
you pass nDocs=0 .
Beware: This method might still crash when using the new PriorityQueue,
when nDocs is 0, because of the code, which 
iterates to call pop() to retrieve the top scored docs left in the
PriorityQueue. It must be tested and fixed.

5. I wrote a detailed main() method, which tests the PriorityQueue by
storing String-s inside it.
Strings implement already the Comparable interface and are perfect for
the test. I tested PriorityQueue-s with
sizes from 0 to 10, putting from 1 to 10 elements inside them and they
worked well. Note that I have not chaned any of the complex logic in the
upHeap() and downHeap() methods. A bug might come from the modification
of the heap array length, though.

I appreciate you comments.
Regards, Ivaylo Zlatev


View raw message