lucene-dev mailing list archives

From Chuck Williams <ch...@manawiz.com>
Subject Re: Strange behavior of positionIncrementGap
Date Sat, 12 Aug 2006 19:41:00 GMT


Yonik Seeley wrote on 08/12/2006 05:08 AM:
> On 8/11/06, Chuck Williams <chuck@manawiz.com> wrote:
>> 1) a b C D ...results in:  _gap_ _gap_ C _gap_ D
>> 2) a B C D ...results in:  _gap_ B _gap_ C _gap_ D
>> 3) A b c D ...results in:  A _gap_ _gap_ _gap_ D
>>
>> This seems a natural behavior and is consistent with the use cases you
>> describe (which are essentially the same reason I'm using gaps, and
>> presumably the main purpose of gaps).
>>
>> Hoss, do you think it would be ok to fix given the potential upward
>> incompatibility for index-format-dependent implementations?
>
> The proposed behavior seems fine to me...
> Is there any incompatibility other than the position gaps changing in
> the presence of fields empty after analysis?
No, the only change is the insertion of extra gaps between empty field
values at the beginning of a list of values for a given field name. 
There are already such gaps between empty field values that occur
anywhere other than the beginning of the field value list.  The idea is
to treat empty field values consistently, independent of whether they
occur at the beginning of the list or somewhere else in the list.
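
To make the intended rule concrete, here is a minimal sketch in plain Java. It is not Lucene code and not the patch. The names GapSketch, layout and isEmptyAfterAnalysis are hypothetical, and the "current" mode assumes a gap is only added once the field already has at least one token, which is my reading of the behavior described above. It follows the notation from the examples: a lowercase value analyzes to nothing, an uppercase value yields one token.

```java
import java.util.ArrayList;
import java.util.List;

class GapSketch {
    // Hypothetical stand-in for analysis, following the thread's notation:
    // a lowercase value produces no tokens, an uppercase value produces one.
    static boolean isEmptyAfterAnalysis(String value) {
        return value.equals(value.toLowerCase());
    }

    // Renders the token/gap layout for all values of one field name.
    // proposed == true : a gap separates every pair of consecutive values.
    // proposed == false: assumed current rule, a gap is added only when
    //                    tokens have already been seen for this field.
    static String layout(String[] values, boolean proposed) {
        List<String> out = new ArrayList<>();
        int length = 0; // number of tokens seen so far for this field
        for (int i = 0; i < values.length; i++) {
            boolean needGap = proposed ? i > 0 : length > 0;
            if (needGap) {
                out.add("_gap_");
            }
            if (!isEmptyAfterAnalysis(values[i])) {
                out.add(values[i]);
                length++;
            }
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        String[][] cases = {
            {"a", "b", "C", "D"},
            {"a", "B", "C", "D"},
            {"A", "b", "c", "D"},
        };
        for (String[] c : cases) {
            System.out.println(String.join(" ", c)
                + "  current: " + layout(c, false)
                + "  proposed: " + layout(c, true));
        }
    }
}
```

Under these assumptions the proposed mode reproduces the three layouts quoted above, while the assumed current mode gives "C _gap_ D" for case 1 and "B _gap_ C _gap_ D" for case 2. Case 3 is the same in both modes, since its empty values do not come first.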

I'll post a patch for you guys to review, hopefully sometime this
weekend.  The one downside of the patch is that it may require some
extra tracking data in DocumentWriter.invertDocument().  I think the
current behavior may have been coded as an expedient to reuse the
existing length tracking for the max-field length check.

Thanks,

Chuck



