lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] [Updated] (LUCENE-4798) PostingsHighlighter's formatter sometimes doesnt highlight matched terms
Date Tue, 26 Feb 2013 03:42:13 GMT


Robert Muir updated LUCENE-4798:

    Attachment: LUCENE-4798.patch

Quick patch: it just sorts the matches by offset before handing them to the formatter.

I added the simple test case I found, and also added an assert to the random test, which
easily tripped on the bug.
> PostingsHighlighter's formatter sometimes doesnt highlight matched terms
> ------------------------------------------------------------------------
>                 Key: LUCENE-4798
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/highlighter
>            Reporter: Robert Muir
>         Attachments: LUCENE-4798.patch
> This can happen when the query terms match many times in the same sentence.
> For example, if you query on "testing highlighter" but the text is
> "Testing highlighters is sometimes harder than testing other things."
> The issue is that the formatter receives all 3 matches, but in this order:
> Testing (first occurrence)
> testing (second occurrence)
> highlighters
> The formatter expects the matches to be sorted by offset (not by term, then
> offset). This is how the javadocs say they should be.
> But there is currently a bug, a stupid side effect of how the ranking is done. Because
> of this, in this example "highlighters" isn't marked up in bold.
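
To illustrate the ordering problem described above, here is a minimal, self-contained sketch in plain Java. It is not the code from LUCENE-4798.patch and does not use the PostingsHighlighter API: the class OffsetOrderSketch, the Match type, and the format, termOrder, and sortedByOffset helpers are all hypothetical names introduced for this example. The formatter is an assumed simplification that, like the one in the report, expects matches sorted by offset and skips any match that ends before its current position.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only; all names here are hypothetical, not Lucene classes.
class OffsetOrderSketch {

    /** A match as [start, end) character offsets into the content. */
    static final class Match {
        final int start;
        final int end;

        Match(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Assumed formatter: walks matches once and expects them sorted by start offset. */
    static String format(String content, List<Match> matches) {
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        for (Match m : matches) {
            if (m.start > pos) {
                // Copy the unhighlighted text between the previous match and this one.
                sb.append(content, pos, m.start);
            }
            if (m.end > pos) {
                sb.append("<b>").append(content, Math.max(pos, m.start), m.end).append("</b>");
                pos = m.end;
            }
            // A match ending at or before pos is silently skipped.
        }
        sb.append(content.substring(pos));
        return sb.toString();
    }

    /** Locates the first occurrence of word (case-sensitive) and wraps it as a Match. */
    static Match find(String content, String word) {
        int start = content.indexOf(word);
        return new Match(start, start + word.length());
    }

    /** Matches grouped by term, then offset: the problematic order from the report. */
    static List<Match> termOrder(String content) {
        List<Match> matches = new ArrayList<>();
        matches.add(find(content, "Testing"));      // first occurrence
        matches.add(find(content, "testing"));      // second occurrence
        matches.add(find(content, "highlighters"));
        return matches;
    }

    /** The idea of the fix: sort by start offset before formatting. */
    static List<Match> sortedByOffset(List<Match> matches) {
        List<Match> copy = new ArrayList<>(matches);
        copy.sort(Comparator.comparingInt((Match m) -> m.start));
        return copy;
    }

    public static void main(String[] args) {
        String content = "Testing highlighters is sometimes harder than testing other things.";
        List<Match> raw = termOrder(content);
        System.out.println(format(content, raw));                 // "highlighters" is left plain
        System.out.println(format(content, sortedByOffset(raw))); // all three matches are bold
    }
}
```

In the unsorted run the formatter has already advanced past the second "testing" by the time it sees "highlighters", so that match is dropped. Sorting by offset first is the smallest change that restores the formatter's documented precondition.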
