lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Doug Cutting <cutt...@lucene.com>
Subject Re: RE : TR : Possible Bug with MultiSearcher?
Date Thu, 19 Sep 2002 22:05:27 GMT


Rasik Pandey wrote:
> Developers,
> Attached is the diff for MultiSearcher which seems to correct these
> bugs. I have not yet found any problems caused by these changes in
> testing........but we will keep you informed!

[ ... ]

diff -w -r1.4 MultiSearcher.java
96,98c100,105
<   public final Document doc(int n) throws IOException {
<     int i = searcherIndex(n);			  // find searcher index
<     return searchers[i].doc(n - starts[i]);
---
 >   public Document doc(int n) throws IOException {
 >     int i = searcherIndex(n);
 >     if (i < 0 || i > searchables.length)
 >         return null;
 >     else
 >       return searchables[i].doc(n - starts[i]);

I don't agree with this change.  If you pass in a bogus document number 
then an exception should be thrown here.  ArrayIndexOutOfBounds is not 
the friendliest of exceptions, but it's better than returning null.

> 113c120
> <       else if (n > midValue)
> ---
> 
>>      else if (n >= midValue) //made this a ">=", to ensure that a valid index
is found

This is the standard binary search algorithm, and should not be changed 
lightly (according to Knuth, no less).  The problem is when there are 
multiple entries with the same value then this algorithm doesn't always 
return the index of the largest one, which is required in this case.  So 
I think the fix is to scan for the largest value:

--- MultiSearcher.java.~1.5.~	2002-07-17 10:28:56.000000000 -0700
+++ MultiSearcher.java	2002-09-19 15:02:02.000000000 -0700
@@ -116,9 +116,13 @@
  	hi = mid - 1;
        else if (n > midValue)
  	lo = mid + 1;
-      else
+      else {
+        while (mid+1 < searchables.length && starts[mid+1] == midValue) {
+          mid++;                                  // return last match
+        }
  	return mid;
      }
+    }
      return hi;
    }

That passes the test suite just checked in too!

Any objections to this?

Doug


--
To unsubscribe, e-mail:   <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>


Mime
View raw message