lucene-dev mailing list archives

From Pierre Gossé (JIRA) <j...@apache.org>
Subject [jira] [Created] (LUCENE-3087) highlighting exact phrase with overlapping tokens fails.
Date Wed, 11 May 2011 13:47:47 GMT
highlighting exact phrase with overlapping tokens fails.
--------------------------------------------------------

                 Key: LUCENE-3087
                 URL: https://issues.apache.org/jira/browse/LUCENE-3087
             Project: Lucene - Java
          Issue Type: Bug
          Components: contrib/highlighter
    Affects Versions: 3.1, 2.9.4
            Reporter: Pierre Gossé
            Priority: Minor


Fields with overlapping tokens are not highlighted in search results when searching for exact
phrases with TermVector.WITH_OFFSETS.

The document built in the MemoryIndex for highlighting does not preserve token positions in
this case. Overlapping tokens get "flattened" (position increment always set to 1), so the span
query used to find the relevant fragment fails to identify the correct token sequence because
of the position shift.
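
To illustrate the position shift, here is a minimal standalone sketch in plain Java with no Lucene dependency. It is not the attached patch: the class and method names are hypothetical, and it assumes a simplified rule where a token that starts at the same offset as the previous token is stacked on it (increment 0), as a synonym would be. Tokens are given as {startOffset, endOffset} pairs, already sorted by start offset.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the code from the LUCENE-3087 patch.
class PositionIncrementSketch {

    // The "flattened" behaviour described above: every token advances by 1.
    static int[] flattened(int[][] offsets) {
        int[] increments = new int[offsets.length];
        Arrays.fill(increments, 1);
        return increments;
    }

    // Assumed simplified rule: a token sharing the previous token's start
    // offset overlaps it and gets increment 0; any other token gets 1.
    static int[] fromOffsets(int[][] offsets) {
        int[] increments = new int[offsets.length];
        for (int i = 0; i < offsets.length; i++) {
            boolean stacked = i > 0 && offsets[i][0] == offsets[i - 1][0];
            increments[i] = stacked ? 0 : 1;
        }
        return increments;
    }

    // Turn increments into absolute positions, starting from -1 so that a
    // first token with increment 1 lands on position 0.
    static int[] positions(int[] increments) {
        int[] result = new int[increments.length];
        int position = -1;
        for (int i = 0; i < increments.length; i++) {
            position += increments[i];
            result[i] = position;
        }
        return result;
    }

    public static void main(String[] args) {
        // "the quick fox", with "fast" stacked on "quick" as an overlapping token.
        int[][] offsets = { {0, 3}, {4, 9}, {4, 9}, {10, 13} };
        System.out.println(Arrays.toString(positions(flattened(offsets))));
        System.out.println(Arrays.toString(positions(fromOffsets(offsets))));
    }
}
```

With flattened increments, "quick" and "fox" end up two positions apart, so an exact phrase match on "quick fox" cannot succeed. With the offset-derived increments they are adjacent again.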

I corrected this by adding a position increment calculation in the StoredTokenStream class.
I also added a JUnit test covering this case.

I used the Eclipse code style from trunk, but it introduces quite a few formatting differences
between the repository and working copy files. I tried to reduce them, but some line-wrapping
rules still don't match.

Correction patch attached.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

