lucene-dev mailing list archives

From Christoph Goller <gol...@detego-software.de>
Subject Re: TermDocs.skipTo()
Date Wed, 07 Apr 2004 10:03:47 GMT
Doug Cutting wrote:
> Christoph Goller wrote:
> 
>> Problem: In TermInfosReader index (every 128th term) skipOffsets are not
>> stored! Due to documentation getIndexOffset returns the offset of the
>> greatest index entry which is less than term. I believe this is not true it
>> may deliver the term itself! If we seek for a term that is in the index,
>> this term and its termInfo will not be read from the enumerator by scanEnum
>> and consequently no skipOffset will be found, even if present. This could
>> lead to serious problems when skipTo is used, couldn't it?
> 
> 
> Yes, this does look like a problem.
> 
>> Possible Solution: Store skipOffset in *.tii too.
> 
> 
> I think that's a good solution.  We should change TermInfosWriter.FORMAT 
> from -1 to -2 and then use that to keep SegmentTermEnum.next() 
> back-compatible, since folks may have created indexes with 1.4RC2.  The 
> simplest way to do this would be to disable skipTo() when 
> TermInfosWriter.FORMAT is -1, by setting skipInterval to 
> Integer.MAX_VALUE, as is done for 1.3 indexes.
> 
> Shall I do this, or would you like to?

I would prefer to leave this task to you :-)
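
For reference, here is a rough sketch of the back-compatibility check described
above, as I understand it. This is not the actual Lucene source: the class,
constant, and method names are hypothetical, the stored interval of 16 is only
an illustrative value, and the docFreq >= skipInterval rule is my assumption
about when skip data is consulted.

```java
// Sketch only: all names here are hypothetical, not the real Lucene code.
public class SkipCompatSketch {
    // Format value that 1.4RC2 wrote: no skipOffset in the *.tii index entries.
    static final int FORMAT_WITHOUT_TII_SKIP_OFFSET = -1;
    // Proposed new format value: skipOffset is stored in *.tii as well.
    static final int FORMAT_WITH_TII_SKIP_OFFSET = -2;

    /**
     * Returns the skip interval a reader should use for a segment.
     * For the old format, skipping is effectively disabled by returning
     * Integer.MAX_VALUE, the same trick described for 1.3 indexes.
     */
    static int effectiveSkipInterval(int format, int storedSkipInterval) {
        if (format == FORMAT_WITHOUT_TII_SKIP_OFFSET) {
            return Integer.MAX_VALUE;
        }
        return storedSkipInterval;
    }

    /**
     * Assumption: skip data is only consulted for terms whose document
     * frequency reaches the skip interval.
     */
    static boolean canUseSkipData(int docFreq, int skipInterval) {
        return docFreq >= skipInterval;
    }

    public static void main(String[] args) {
        int oldInterval = effectiveSkipInterval(FORMAT_WITHOUT_TII_SKIP_OFFSET, 16);
        int newInterval = effectiveSkipInterval(FORMAT_WITH_TII_SKIP_OFFSET, 16);
        // Old-format segments never use skip data; new-format segments can.
        System.out.println("old format skips: " + canUseSkipData(1000, oldInterval));
        System.out.println("new format skips: " + canUseSkipData(1000, newInterval));
    }
}
```

With Integer.MAX_VALUE as the interval, no realistic docFreq can reach it, so
skipTo() on a 1.4RC2 index would just fall back to scanning sequentially.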

However, I am currently debugging and stepping through a problem that Daniel
found with 1.4rc2. Maybe it's caused by a skipTo() bug; I am not sure yet.
Maybe it's a bug in ConjunctionScorer. If I cannot solve the problem, I will
post it to the mailing list tonight.

What about the following agreement:
I try to restructure the IndexReader stuff as we already agreed,
you try to solve the skipTo() problem,
and then we review each other's work.

Christoph




