lucene-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: TermDocs.skipTo()
Date Thu, 08 Apr 2004 18:15:32 GMT
Christoph Goller wrote:
> Daniel found a bug today and therefore I reviewed skipTo once again.

Thanks!

> Here are some further things to consider:
> 
> *) MultiTermDocs.skipTo could easily be optimized too, couldn't it?

Yes, I think so.  I think I forgot to look at that.

> *) SegmentTermDocs: skipStream never closed

You're right, it should be.

> *) SegmentTermPositions: seek(Terminfo): probably should always make
> proxCount = 0;

Right again.

I can't think why I ever did it that way...  It was done as a fix for:

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=6292

http://cvs.apache.org/viewcvs.cgi/jakarta-lucene/src/java/org/apache/lucene/index/SegmentTermPositions.java?r1=1.2&r2=1.3

> *) I think due to your last changes SegmentTermDocs makes one skip less 
> than is required? However, I haven´t tested this.
> 
> while (target > skipDoc && skipCount < numSkips) {
>         lastSkipDoc = skipDoc;
>         lastFreqPointer = freqPointer;
>         lastProxPointer = proxPointer;
> 
>         if (skipDoc != 0 && skipDoc >= doc)
>           numSkipped += skipInterval;
> 
>         skipDoc += skipStream.readVInt();
>         freqPointer += skipStream.readVInt();
>         proxPointer += skipStream.readVInt();
> 
>         skipCount++;
>       }
> 
>       // if we found something to skip, then skip it
>       if (lastFreqPointer > freqStream.getFilePointer()) {
>         freqStream.seek(lastFreqPointer);
>         skipProx(lastProxPointer);
> 
>         doc = lastSkipDoc;
>         count += numSkipped;
>       }
> 
> Consider exit of while because of skipCount == numSkips. Then doc 
> becomes lastSkipDoc not skipDoc!

That sounds reasonable.  I'm sure having trouble getting this method 
right!  So do you think this loop should be changed to something like:

   while (target > skipDoc) {
     lastSkipDoc = skipDoc;
     ...

     if (skipCount >= numSkips)
       break;

     skipDoc += skipStream.readVInt();
     ...
   }

That looks better to me...  What do you think?
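
To make the difference concrete, here is a minimal stand-alone sketch of the two loop shapes. It is not the SegmentTermDocs source: the class SkipLoopSketch, its method names, and the skipDeltas array (standing in for the skipStream.readVInt() calls) are hypothetical, and the freq/prox pointers and numSkipped bookkeeping are left out. The bound check is written as skipCount >= numSkips on the assumption that skipCount counts the entries already read, so no read happens once all numSkips entries are consumed.

```java
// Hypothetical model of the skip loop discussed above; not the Lucene source.
final class SkipLoopSketch {

    // Loop shape from the quoted code: the bound is part of the while condition,
    // so the last skip entry read is never recorded in lastSkipDoc.
    static int quotedLoop(int[] skipDeltas, int target) {
        int skipDoc = 0, lastSkipDoc = 0, skipCount = 0;
        int numSkips = skipDeltas.length;
        while (target > skipDoc && skipCount < numSkips) {
            lastSkipDoc = skipDoc;
            skipDoc += skipDeltas[skipCount]; // stands in for skipStream.readVInt()
            skipCount++;
        }
        return lastSkipDoc;
    }

    // Revised shape: record the current entry first, then stop if nothing is left to read.
    static int revisedLoop(int[] skipDeltas, int target) {
        int skipDoc = 0, lastSkipDoc = 0, skipCount = 0;
        int numSkips = skipDeltas.length;
        while (target > skipDoc) {
            lastSkipDoc = skipDoc;
            if (skipCount >= numSkips)
                break;
            skipDoc += skipDeltas[skipCount]; // stands in for skipStream.readVInt()
            skipCount++;
        }
        return lastSkipDoc;
    }

    public static void main(String[] args) {
        int[] deltas = {10, 10, 10}; // skip entries at docs 10, 20, 30
        System.out.println(quotedLoop(deltas, 35));  // prints 20: one skip short
        System.out.println(revisedLoop(deltas, 35)); // prints 30
    }
}
```

With a target inside the skip data (for example 25) both variants return 20. They differ only when the loop ends because the skip entries run out, which is the case Christoph describes.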

> *) PhraseScorer.skipTo jumps one doc too far because of call to sort() 
> which calls next for each PhrasePosition. Here is Daniel's test that 
> demonstrates this:  [ ... ]
> 
> Instead of 1 hit, 0 hits are found with 1.4rc2, while 1.3 finds the hit. I
> committed the necessary change to PhraseScorer already and it fixes the 
> problem.

Thanks!  If you have a chance, please add this as a unit test too.

> Unfortunately, I haven´t found the time to restructure the IndexReaders 
> so far.

Thanks again for all your work.  You're helping to make Lucene much more 
reliable!

Cheers,

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org

