lucene-dev mailing list archives

From "David Smiley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-8286) UnifiedHighlighter should support the new Weight.matches API for better match accuracy
Date Wed, 02 May 2018 21:35:00 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-8286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461646#comment-16461646 ]

David Smiley commented on LUCENE-8286:
--------------------------------------

{quote}there is no way to retrieve the term/query in the matches iterator
{quote}
Oh I see – this was removed in LUCENE-8270! I was loosely following the related issues
but overlooked that. [~romseygeek] the statement in the description "we don't have
a clear use-case for this yet" surprises me; the use-case is clearly _highlighting_, no? Despite
this blocker, maybe I could put together a patch here, one that has poor scoring because we
don't know the term. That would help identify how a matchesIterator.term() could be used.
{quote}One thing we could do to simplify the transition is to remove OffsetsEnum entirely
and replace it with the MatchesIterator; apart from the missing bits I described above, this
should be easy to do.
{quote}
Or make OffsetsEnum extend MatchesIterator? OffsetsEnum has things we need – term(), freq().
MatchesIterator has things we don't need – position spans – but these can be ignored.
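
To make the "extend" idea concrete, here is a rough, self-contained sketch. The types below are hypothetical stand-ins written for illustration only; they are not Lucene's actual MatchesIterator or OffsetsEnum, and the method set is assumed from the discussion above (offsets and position spans on the iterator, term() and freq() added by the subtype).

```java
// Hypothetical, simplified stand-in for a matches iterator (NOT Lucene's real interface).
interface SimpleMatchesIterator {
    boolean next();        // advance to the next match; false when exhausted
    int startPosition();   // position span start (not needed for highlighting)
    int endPosition();     // position span end (not needed for highlighting)
    int startOffset();     // character offset where the match starts
    int endOffset();       // character offset where the match ends
}

// Hypothetical OffsetsEnum-like type that layers term() and freq() on top.
abstract class SimpleOffsetsEnum implements SimpleMatchesIterator {
    abstract String term(); // the matching term, used for passage scoring
    abstract int freq();    // number of occurrences of the term in the document

    // Position spans are ignored by the highlighter in this sketch; report "unknown".
    @Override public int startPosition() { return -1; }
    @Override public int endPosition() { return -1; }
}

// Toy implementation over a fixed array of {start, end} offset pairs for one term.
final class SingleTermOffsets extends SimpleOffsetsEnum {
    private final String term;
    private final int[][] offsets;
    private int idx = -1;

    SingleTermOffsets(String term, int[][] offsets) {
        this.term = term;
        this.offsets = offsets;
    }

    @Override public boolean next() {
        if (idx < offsets.length) {
            idx++;
        }
        return idx < offsets.length;
    }

    @Override public int startOffset() { return offsets[idx][0]; }
    @Override public int endOffset() { return offsets[idx][1]; }
    @Override String term() { return term; }
    @Override int freq() { return offsets.length; }
}
```

A caller that only understands the iterator interface could still consume a SingleTermOffsets, while highlighter code that needs scoring information would use term() and freq() on the subtype.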
{quote}we can't easily use term vectors for a single field with Matches.
{quote}
Interesting; I'll take a closer look.

> UnifiedHighlighter should support the new Weight.matches API for better match accuracy
> --------------------------------------------------------------------------------------
>
>                 Key: LUCENE-8286
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8286
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/highlighter
>            Reporter: David Smiley
>            Priority: Major
>
> The new Weight.matches() API should allow the UnifiedHighlighter to highlight
some BooleanQuery patterns more accurately -- see LUCENE-7903.
> In addition, this API should make the job of highlighting easier, reducing the LOC and
related complexities, especially the UH's PhraseHelper.  Note: reducing/removing PhraseHelper
is not a near-term goal since Weight.matches is experimental and incomplete, and perhaps we'll
discover some gaps in flexibility/functionality.
> This issue should introduce a new UnifiedHighlighter.HighlightFlag enum option for this
method of highlighting.   Perhaps call it {{WEIGHT_MATCHES}}?  Longer term it could go away
and it'll be implied if you specify enum values for PHRASES & MULTI_TERM_QUERY?
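
As a rough illustration of the flag proposal in the description, the sketch below uses a standalone enum rather than the real UnifiedHighlighter.HighlightFlag. The WEIGHT_MATCHES name is the one tentatively suggested in the issue; the FlagPolicy class and both of its methods are hypothetical and only show the two behaviours being discussed: an explicit opt-in now, and a possible longer-term rule where the mode is implied by PHRASES plus MULTI_TERM_QUERY.

```java
import java.util.EnumSet;
import java.util.Set;

// Standalone stand-in for the highlighter's flag enum (assumption: only the
// values named in the issue description are modelled here).
enum HighlightFlag { PHRASES, MULTI_TERM_QUERY, WEIGHT_MATCHES }

// Hypothetical helper showing the two policies discussed in the issue.
final class FlagPolicy {
    private FlagPolicy() {}

    // Near term: Weight.matches-based highlighting is an explicit opt-in.
    static boolean useWeightMatchesExplicit(Set<HighlightFlag> flags) {
        return flags.contains(HighlightFlag.WEIGHT_MATCHES);
    }

    // Possible longer term: the mode is implied when both PHRASES and
    // MULTI_TERM_QUERY are requested, so no dedicated flag is needed.
    static boolean useWeightMatchesImplied(Set<HighlightFlag> flags) {
        return flags.contains(HighlightFlag.PHRASES)
            && flags.contains(HighlightFlag.MULTI_TERM_QUERY);
    }
}
```

For example, EnumSet.of(HighlightFlag.PHRASES, HighlightFlag.MULTI_TERM_QUERY) would not enable the mode under the explicit policy but would under the implied one.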



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
