lucene-dev mailing list archives

From "Alan Woodward (JIRA)" <>
Subject [jira] [Commented] (LUCENE-8401) Add PassageBuilder to help construct highlights using MatchesIterator
Date Mon, 16 Jul 2018 10:27:00 GMT


Alan Woodward commented on LUCENE-8401:

Here is a patch containing the following classes:
 * Passage -> a representation of a highlight snippet, with text, start and end offset,
and a list of internal hits
 * PassageBreaker -> an interface that defines where passages should start and end, and
if a hit should be added to the current passage or should start a new one
 * PassageBuilder -> public API that iteratively returns new Passage objects
 * BreakIteratorPassageBreaker -> simple implementation of PassageBreaker
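To give a rough idea of how these pieces could fit together, here is a minimal sketch of a
breaker interface and a sentence-based implementation built on java.text.BreakIterator. It
is not the code in the patch: the names Breaker, SentenceBreaker, passageStart, passageEnd
and joinsCurrent are hypothetical and chosen only for illustration.

```java
import java.text.BreakIterator;
import java.util.Locale;

public class PassageBreakerSketch {

    /** Hypothetical interface; the method names are illustrative, not taken from the patch. */
    interface Breaker {
        /** Offset at which a passage containing a hit starting at hitStart should begin. */
        int passageStart(String text, int hitStart);

        /** Offset at which a passage containing a hit ending at hitEnd should finish. */
        int passageEnd(String text, int hitEnd);

        /** True if a hit ending at hitEnd still belongs to the passage starting at passageStart. */
        boolean joinsCurrent(int passageStart, int hitEnd, int maxSize);
    }

    /** Assumed implementation: passages are aligned to sentence boundaries. */
    static final class SentenceBreaker implements Breaker {
        private final BreakIterator sentences = BreakIterator.getSentenceInstance(Locale.ROOT);

        @Override
        public int passageStart(String text, int hitStart) {
            sentences.setText(text);
            // Probe one character past the hit start so a hit sitting exactly on a
            // boundary keeps that boundary as its passage start.
            int probe = Math.min(hitStart + 1, text.length());
            int boundary = sentences.preceding(probe);
            return boundary == BreakIterator.DONE ? 0 : boundary;
        }

        @Override
        public int passageEnd(String text, int hitEnd) {
            sentences.setText(text);
            // Probe one character before the hit end so a hit ending exactly on a
            // boundary does not spill into the following sentence.
            int probe = Math.min(Math.max(hitEnd - 1, 0), text.length());
            int boundary = sentences.following(probe);
            return boundary == BreakIterator.DONE ? text.length() : boundary;
        }

        @Override
        public boolean joinsCurrent(int passageStart, int hitEnd, int maxSize) {
            return hitEnd - passageStart <= maxSize;
        }
    }

    public static void main(String[] args) {
        String text = "First one. Second has a hit here. Third one.";
        int hitStart = text.indexOf("hit");
        int hitEnd = hitStart + "hit".length();
        Breaker breaker = new SentenceBreaker();
        int start = breaker.passageStart(text, hitStart);
        int end = breaker.passageEnd(text, hitEnd);
        // Prints the sentence that surrounds the hit.
        System.out.println(text.substring(start, end).trim());
    }
}
```

A real Passage would also carry its text and the list of hits it contains. The sketch only
covers the boundary decisions.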

The passage builder uses a wrapper around its MatchesIterator to enable it to peek at the
position of the next hit, which means that we can improve clustering for hits that look like
[A .............. B .. C], where A and B are within the maximum snippet size, but A and C
are not.
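The [A .............. B .. C] case can be illustrated with a small standalone sketch. It
does not use the Lucene Matches API: Hit, Peeking and cluster are hypothetical names, and
the rule used here (start a new passage at B when C cannot fit with A, can fit with B, and
is closer to B than A is) is one plausible heuristic assumed for the example, not
necessarily what the patch implements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class PeekingClusterSketch {

    /** A hit with character offsets; a stand-in for what a matches iterator would report. */
    static final class Hit {
        final int start;
        final int end;
        Hit(int start, int end) { this.start = start; this.end = end; }
    }

    /** Wraps an iterator so the caller can look at the next element without consuming it. */
    static final class Peeking<T> implements Iterator<T> {
        private final Iterator<T> in;
        private T next;
        Peeking(Iterator<T> in) { this.in = in; advance(); }
        private void advance() { next = in.hasNext() ? in.next() : null; }
        @Override public boolean hasNext() { return next != null; }
        /** Returns the upcoming element, or null when the iterator is exhausted. */
        T peek() { return next; }
        @Override public T next() { T current = next; advance(); return current; }
    }

    /** Groups hits (sorted by offset) into passages no wider than maxSize characters. */
    static List<List<Hit>> cluster(List<Hit> hits, int maxSize) {
        List<List<Hit>> passages = new ArrayList<>();
        List<Hit> current = new ArrayList<>();
        Peeking<Hit> it = new Peeking<>(hits.iterator());
        while (it.hasNext()) {
            Hit hit = it.next();
            if (current.isEmpty()) {
                current.add(hit);
                continue;
            }
            Hit first = current.get(0);
            Hit last = current.get(current.size() - 1);
            boolean fits = hit.end - first.start <= maxSize;
            Hit upcoming = it.peek();
            // Assumed heuristic: prefer grouping this hit with the upcoming one when the
            // upcoming hit cannot join the current passage but sits closer to this hit.
            boolean betterWithNext = upcoming != null
                && upcoming.end - first.start > maxSize
                && upcoming.end - hit.start <= maxSize
                && upcoming.start - hit.end < hit.start - last.end;
            if (fits && !betterWithNext) {
                current.add(hit);
            } else {
                passages.add(current);
                current = new ArrayList<>();
                current.add(hit);
            }
        }
        if (!current.isEmpty()) {
            passages.add(current);
        }
        return passages;
    }

    public static void main(String[] args) {
        // A and B fit within 100 characters, A and C do not, and C is close to B.
        List<Hit> hits = Arrays.asList(new Hit(0, 1), new Hit(80, 81), new Hit(105, 106));
        for (List<Hit> passage : cluster(hits, 100)) {
            System.out.println(passage.size() + " hit(s) starting at " + passage.get(0).start);
        }
    }
}
```

Without the peek, a greedy builder would group A with B and leave C alone. With it, the
sketch puts A in its own passage and keeps B and C together.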

> Add PassageBuilder to help construct highlights using MatchesIterator
> ---------------------------------------------------------------------
>                 Key: LUCENE-8401
>                 URL:
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/highlighter
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Major
>         Attachments: LUCENE-8401.patch
> Jim and I discussed a while back the idea of adding highlighter components, rather than
> a fully-fledged highlighter, which would allow users to build their own specialised
> highlighters. To that end, I'd like to add a PassageBuilder class that uses the Matches
> API to break text up into passages containing hits.
