lucene-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: TermDocs.skipTo()
Date Mon, 05 Apr 2004 20:42:40 GMT
Christoph Goller wrote:
> Problem: In TermInfosReader index (every 128th term) skipOffsets are not
> stored! Due to documentation getIndexOffset returns the offset of the greatest
> index entry which is less than term. I believe this is not true it may
> deliver the term itself! If we seek for a term that is in the index, this
> term and its termInfo will not be read from the enumerator by scanEnum and
> consequently no skipOffset will be found, even if present. This could lead
> to serious problems when skipTo is used, couldn't it?

Yes, this does look like a problem.
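
To make the failure mode concrete, here is a minimal sketch of the lookup described above. It is not the TermInfosReader source: the class name and the plain String[] index are hypothetical stand-ins, and it only assumes a binary search over the sorted index terms. The point it shows is that the search can return the entry equal to the term, not only one strictly less than it.

```java
// Hypothetical model of the term index lookup; names are illustrative only.
final class IndexLookupSketch {

    // Returns the position of the greatest index term that is <= term,
    // or -1 if term sorts before every index term.
    static int getIndexOffset(String[] indexTerms, String term) {
        int lo = 0;
        int hi = indexTerms.length - 1;
        while (hi >= lo) {
            int mid = (lo + hi) >>> 1;
            int delta = term.compareTo(indexTerms[mid]);
            if (delta < 0) {
                hi = mid - 1;
            } else if (delta > 0) {
                lo = mid + 1;
            } else {
                return mid; // exact hit: the index entry IS the term
            }
        }
        return hi;
    }

    public static void main(String[] args) {
        String[] indexTerms = {"apple", "mango", "zebra"};
        // "mango" is itself an index entry, so the lookup lands directly on it.
        System.out.println(getIndexOffset(indexTerms, "mango"));
        // "banana" falls between entries, so the preceding entry is returned.
        System.out.println(getIndexOffset(indexTerms, "banana"));
    }
}
```

On an exact hit, a reader that takes the TermInfo straight from the index entry never scans the main term file for that term, so any skipOffset stored only there would be missed.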

> Possible Solution: Store skipOffset in *.tii too.

I think that's a good solution.  We should change TermInfosWriter.FORMAT 
from -1 to -2 and then use that to keep SegmentTermEnum.next() 
back-compatible, since folks may have created indexes with 1.4RC2.  The 
simplest way to do this would be to disable skipTo() when 
TermInfosWriter.FORMAT is -1, by setting skipInterval to 
Integer.MAX_VALUE, as is done for 1.3 indexes.
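
A rough sketch of that version check, under stated assumptions: the class and method names are hypothetical, and it assumes the first int of the file is either a negative format marker or, for 1.3 files, a non-negative value with no marker at all. It is not the actual SegmentTermEnum code.

```java
// Hypothetical helper illustrating the proposed back-compatibility rule.
final class SkipIntervalCompat {
    // Format markers named in this message.
    static final int FORMAT_RC2 = -1;           // written by 1.4RC2, no skipOffset in *.tii
    static final int FORMAT_WITH_TII_SKIP = -2; // proposed: skipOffset also stored in *.tii

    // Picks the skip interval a reader should use for a given file header.
    static int effectiveSkipInterval(int firstInt, int storedSkipInterval) {
        if (firstInt >= 0) {
            // Assumption: 1.3 files carry no negative format marker and no skip data.
            return Integer.MAX_VALUE;
        }
        if (firstInt == FORMAT_RC2) {
            // Skip data exists but the index entries lack skipOffset, so disable skipping.
            return Integer.MAX_VALUE;
        }
        if (firstInt == FORMAT_WITH_TII_SKIP) {
            return storedSkipInterval;
        }
        throw new IllegalArgumentException("Unknown format version: " + firstInt);
    }

    public static void main(String[] args) {
        System.out.println(effectiveSkipInterval(FORMAT_RC2, 16) == Integer.MAX_VALUE);
        System.out.println(effectiveSkipInterval(FORMAT_WITH_TII_SKIP, 16));
    }
}
```

With the interval at Integer.MAX_VALUE, no realistic posting list should ever reach a skip point, so skipTo() on such an index would presumably just scan forward document by document.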

Shall I do this, or would you like to?

Thanks so much for finding things like this!

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org

