lucene-dev mailing list archives

From "David Smiley (JIRA)" <>
Subject [jira] [Commented] (LUCENE-8286) UnifiedHighlighter should support the new Weight.matches API for better match accuracy
Date Thu, 03 May 2018 13:50:00 GMT


David Smiley commented on LUCENE-8286:

The "span" width _could_ be used for passage relevancy, and perhaps ought to be – sure.
 I just meant to convey that today the UH doesn't have or use this info.
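
To make the idea concrete, here is a minimal sketch of how a span width might feed into a
passage score. Nothing here is UH code: the class, the method names and the density formula
are all hypothetical, and the UH does not do this today. It only assumes that each match
exposes start and end term positions, as a positional match would.

```java
import java.util.List;

// Hypothetical sketch: none of these names exist in Lucene or the UnifiedHighlighter.
class SpanWidthScore {

    // Density of a match: query terms matched divided by the positions the span covers.
    // A tight phrase match scores 1.0; a sloppy or wide span scores lower.
    static double spanBoost(int startPosition, int endPosition, int termCount) {
        int width = endPosition - startPosition + 1; // positions are inclusive
        if (width <= 0 || termCount <= 0) {
            throw new IllegalArgumentException("invalid span or term count");
        }
        return (double) termCount / width;
    }

    // Assumed aggregation: a passage's score is the sum of the boosts of its matches.
    // Each int[] is {startPosition, endPosition, termCount}.
    static double passageScore(List<int[]> matches) {
        double sum = 0.0;
        for (int[] m : matches) {
            sum += spanBoost(m[0], m[1], m[2]);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Two terms adjacent (positions 3..4) versus the same two terms spread over 3..10.
        System.out.println(spanBoost(3, 4, 2));
        System.out.println(spanBoost(3, 10, 2));
    }
}
```

With this formula a passage containing the tight match would outrank one containing only the
wide match. Whether density is the right signal is an open question.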

BTW I did a quick hack integration of Weight.matches into the UH last night and ran some
tests. I had no issue with term vectors. The fieldMatcher (aka the requireFieldMatch option)
will require some work. And if the query references non-highlighted fields in a way that
constrains the results (e.g. MUST otherfield:foo), then for the Analysis offset strategy
we'll need to combine an aggregate index view of the analysis with the underlying real index
for the other fields, because the MemoryIndex alone has only one field: the field being
highlighted.

> UnifiedHighlighter should support the new Weight.matches API for better match accuracy
> --------------------------------------------------------------------------------------
>                 Key: LUCENE-8286
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/highlighter
>            Reporter: David Smiley
>            Priority: Major
> The new Weight.matches() API should allow the UnifiedHighlighter to more accurately highlight
> some BooleanQuery patterns -- see LUCENE-7903.
> In addition, this API should make the job of highlighting easier, reducing the LOC and
> related complexities, especially the UH's PhraseHelper. Note: reducing/removing PhraseHelper
> is not a near-term goal since Weight.matches is experimental and incomplete, and perhaps we'll
> discover some gaps in flexibility/functionality.
> This issue should introduce a new UnifiedHighlighter.HighlightFlag enum option for this
> method of highlighting. Perhaps call it {{WEIGHT_MATCHES}}? Longer term it could go away
> and be implied if you specify the enum values PHRASES & MULTI_TERM_QUERY?
