lucene-dev mailing list archives

From "Mike Sokolov (JIRA)" <>
Subject [jira] [Updated] (LUCENE-3318) Sketch out highlighting based on term positions / position iterators
Date Sun, 07 Aug 2011 22:46:27 GMT


Mike Sokolov updated LUCENE-3318:

    Attachment: LUCENE-3318.patch

The updated patch now handles MultiTermQuery instances nested (however deeply) in BooleanQuery
instances, and it rewrites them without modifying the original queries.

This approach can now be used to highlight many types of queries (SpanQueries are a special
case: they do their own MTQ rewriting), but it still requires inspecting query types.
It might be nice in the future to provide a Query rewriting interface that gives the caller
more control over the kind of rewriting done. With that in place, future query types could
be expected to implement a rewriting method that supports highlighting without the need for
instanceof checks and the like.
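
To make the idea of a rewriting interface concrete, here is a minimal sketch using toy stand-in types. None of these names (HlQuery, TermQ, PrefixQ, BoolQ, rewriteForHighlight) are Lucene classes or part of the patch; they are assumptions made purely for illustration. Each query type expands itself for highlighting and returns a new tree, so the caller needs no instanceof checks and the original query is left untouched.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical interface: every query type knows how to expand itself.
interface HlQuery {
    // Returns an expanded copy for highlighting; must not mutate this query.
    HlQuery rewriteForHighlight(List<String> indexTerms);
}

// A plain term needs no expansion.
final class TermQ implements HlQuery {
    final String term;
    TermQ(String term) { this.term = term; }
    public HlQuery rewriteForHighlight(List<String> indexTerms) { return this; }
    @Override public String toString() { return term; }
}

// Stand-in for a multi-term query: expands to the matching index terms.
final class PrefixQ implements HlQuery {
    final String prefix;
    PrefixQ(String prefix) { this.prefix = prefix; }
    public HlQuery rewriteForHighlight(List<String> indexTerms) {
        List<HlQuery> matches = new ArrayList<>();
        for (String t : indexTerms) {
            if (t.startsWith(prefix)) matches.add(new TermQ(t));
        }
        return new BoolQ(matches);
    }
    @Override public String toString() { return prefix + "*"; }
}

// Stand-in for a boolean query: rewrites its clauses recursively into a new node.
final class BoolQ implements HlQuery {
    final List<HlQuery> clauses;
    BoolQ(List<HlQuery> clauses) {
        this.clauses = Collections.unmodifiableList(new ArrayList<>(clauses));
    }
    public HlQuery rewriteForHighlight(List<String> indexTerms) {
        List<HlQuery> out = new ArrayList<>();
        for (HlQuery c : clauses) out.add(c.rewriteForHighlight(indexTerms));
        return new BoolQ(out);
    }
    @Override public String toString() {
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < clauses.size(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(clauses.get(i));
        }
        return sb.append(')').toString();
    }
}
```

Because the recursion lives in the query types themselves, nesting depth does not matter, and a new query type only has to implement the one method.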

I added a simple white-space boundary detection that can be used to adjust snippet boundaries
to fall between words.  The basic idea could be extended to do sentence boundary detection.
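
As a rough illustration of whitespace boundary detection, the sketch below widens a snippet's character offsets outward until they fall between words. The class and method names (WhitespaceBoundary, adjustStart, adjustEnd, snippet) are hypothetical and are not taken from the patch, which may well work differently.

```java
final class WhitespaceBoundary {
    private WhitespaceBoundary() {}

    // Moves a start offset left to the beginning of the word that contains it.
    static int adjustStart(CharSequence text, int start) {
        int i = Math.max(0, Math.min(start, text.length()));
        while (i > 0 && !Character.isWhitespace(text.charAt(i - 1))) {
            i--;
        }
        return i;
    }

    // Moves an exclusive end offset right to the end of the word that contains it.
    static int adjustEnd(CharSequence text, int end) {
        int i = Math.max(0, Math.min(end, text.length()));
        while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
            i++;
        }
        return i;
    }

    // Returns the snippet for [start, end) widened to whole words.
    static String snippet(String text, int start, int end) {
        return text.substring(adjustStart(text, start), adjustEnd(text, end));
    }
}
```

For example, with the text "the quick brown fox", the raw range [6, 12) cuts both "quick" and "brown" in half; after adjustment it becomes [4, 15), which is "quick brown". Sentence boundary detection could follow the same shape with a different stop condition.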

I added a Highlighter parameter (maxFragsToScore) that limits the number of fragments considered
when attempting to find the highest-scoring one(s). This gives some decent speedup if you
just want to find the first fragment in a document.
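
The effect of a cap like maxFragsToScore can be sketched as follows. Only the parameter name comes from the message above; the FragmentSelector and Fragment types are assumptions for illustration and do not reflect the patch's actual classes. Fragments are assumed to arrive in document order, and scanning stops once the cap is reached.

```java
final class FragmentSelector {
    private FragmentSelector() {}

    // Hypothetical fragment: a character range with a relevance score.
    static final class Fragment {
        final int start;
        final int end;
        final float score;
        Fragment(int start, int end, float score) {
            this.start = start;
            this.end = end;
            this.score = score;
        }
    }

    // Returns the highest-scoring fragment among the first maxFragsToScore
    // fragments, or null if none were considered.
    static Fragment best(Iterable<Fragment> fragsInDocOrder, int maxFragsToScore) {
        Fragment best = null;
        int scored = 0;
        for (Fragment f : fragsInDocOrder) {
            if (scored >= maxFragsToScore) {
                break; // cap reached: skip the rest of the document
            }
            scored++;
            if (best == null || f.score > best.score) {
                best = f;
            }
        }
        return best;
    }
}
```

With a cap of 1 this returns the first fragment without looking at the rest of the document, which is where the speedup would come from; the trade-off is that a better fragment later in the document is never seen.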

> Sketch out highlighting based on term positions / position iterators
> --------------------------------------------------------------------
>                 Key: LUCENE-3318
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Sub-task
>          Components: modules/highlighter
>    Affects Versions: Positions Branch
>            Reporter: Simon Willnauer
>            Assignee: Mike Sokolov
>             Fix For: Positions Branch
>         Attachments: LUCENE-3318.patch, LUCENE-3318.patch, LUCENE-3318.patch
> Spin-off from LUCENE-2878. Since we already have positions on a large number of queries
in the branch, it is worth looking at highlighting as a real consumer of the API. A prototype
is already committed.
