lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] Commented: (LUCENE-2529) always apply position increment gap between values
Date Tue, 05 Oct 2010 22:21:34 GMT


Robert Muir commented on LUCENE-2529:

David, sorry to make you repeat yourself.

I'm not sure about this change, though; for example, it breaks the contrib/highlighter tests
(I didn't look at the tests in more detail).

However, if we decide to do it in the future, I think we should remove these special checks
from the main loop. For example, instead of:

    if (firstToken && i == 0) // i.e. this is the very first token we emit for this field in this document
      fieldState.position--;  // we want to start at 0, not 1

this could check hasMoreTokens && i == 0 before the loop, so it's not checked for every
token in the document.
Similarly, I think the "correct < 0 to 0" check should probably be outside of the
loop, since it can only really happen for the first term.
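
The hoisting suggested above can be sketched on a simplified model. This is not Lucene's
DocInverterPerField code: the class PositionSketch, both method names, and the representation
of each field value as an array of non-negative position increments are assumptions made for
illustration only. The first method runs the "first token" and "correct < 0 to 0" checks for
every token; the second handles the first token of the first value once, before the loop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of position assignment; not Lucene's actual code.
final class PositionSketch {

    // Special-case checks are evaluated for every token.
    static List<Integer> positionsChecked(int[][] values, int gap) {
        List<Integer> out = new ArrayList<>();
        int position = 0;
        for (int i = 0; i < values.length; i++) {
            if (i > 0) position += gap;               // gap between values
            boolean firstToken = true;
            for (int posIncr : values[i]) {
                position += posIncr;
                if (firstToken && i == 0) position--; // start at 0, not 1
                if (position < 0) position = 0;       // correct < 0 to 0
                firstToken = false;
                out.add(position);
            }
        }
        return out;
    }

    // Same result for non-negative increments and gap, with the checks hoisted.
    static List<Integer> positionsHoisted(int[][] values, int gap) {
        List<Integer> out = new ArrayList<>();
        int position = 0;
        for (int i = 0; i < values.length; i++) {
            if (i > 0) position += gap;
            int[] incrs = values[i];
            int start = 0;
            if (i == 0 && incrs.length > 0) {
                // Only the very first token can drive the position below zero.
                position = Math.max(0, position + incrs[0] - 1);
                out.add(position);
                start = 1;
            }
            for (int t = start; t < incrs.length; t++) {
                position += incrs[t];                 // no per-token checks
                out.add(position);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] values = {{1, 1}, {1, 1}};
        System.out.println(positionsChecked(values, 10));
        System.out.println(positionsHoisted(values, 10));
    }
}
```

The two methods agree only under the stated assumption that increments and the gap are never
negative; with that assumption, the clamp can fire only on the first token of the first value.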

> always apply position increment gap between values
> --------------------------------------------------
>                 Key: LUCENE-2529
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.9.3, 3.0.2, 3.1, 4.0
>         Environment: (I don't know which version to say this affects since it's some quasi trunk release and the new versioning scheme confuses me.)
>            Reporter: David Smiley
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>         Attachments: LUCENE-2529_always_apply_position_increment_gap_between_values.patch,
>                      LUCENE-2529_nonsenseIncrements.patch, LUCENE-2529_skip_posIncr_for_1st_token.patch,
>                      LUCENE-2529_skip_posIncr_for_1st_token.patch, LUCENE-2529_skip_posIncr_for_1st_token.patch,
>                      LUCENE-2529_test.patch
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> I'm doing some fancy stuff with span queries that is very sensitive to term positions.
> I discovered that the position increment gap on indexing is only applied between values when
> there are existing terms indexed for the document.  I suspect this logic wasn't deliberate;
> it's just how it's always been, for no particular reason.  I think it should always apply the
> gap between fields.  Reference line 82:
> if (fieldState.length > 0)
>           fieldState.position += docState.analyzer.getPositionIncrementGap(;
> This is checking fieldState.length.  I think the condition should simply be:  if (i >
> I don't think this change will affect anyone at all, but it will certainly help me.  Presently,
> I can either change this line in Lucene, or I can put in a hack so that the first value for
> the document is some dummy value, which is wasteful.
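
The difference between the two gap conditions in the description above can be sketched on a
toy model. Everything here is hypothetical: GapSketch, its positions method, and the
alwaysApplyGap flag are not Lucene APIs, and because the proposed condition is truncated in
the quoted text, the sketch assumes it tests the value index (i > 0). Each value is reduced
to a token count, and every token advances the position by one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of the gap logic; not Lucene's actual code.
final class GapSketch {

    // tokensPerValue[i] is the number of tokens produced by the i-th value of the field.
    static List<Integer> positions(int[] tokensPerValue, int gap, boolean alwaysApplyGap) {
        List<Integer> out = new ArrayList<>();
        int position = 0;
        int length = 0; // tokens indexed so far, standing in for fieldState.length
        for (int i = 0; i < tokensPerValue.length; i++) {
            // Assumed reading of the proposal: i > 0; existing behaviour: length > 0.
            boolean applyGap = alwaysApplyGap ? i > 0 : length > 0;
            if (applyGap) position += gap;
            for (int t = 0; t < tokensPerValue[i]; t++) {
                out.add(position);
                position++;
                length++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] emptyFirstValue = {0, 2}; // first value yields no tokens
        System.out.println(positions(emptyFirstValue, 100, false));
        System.out.println(positions(emptyFirstValue, 100, true));
    }
}
```

In this model the two conditions differ only when earlier values produced no tokens: the
length-based check skips the gap, while the index-based check still applies it.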

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
