lucene-dev mailing list archives

From "Mark Miller (JIRA)" <>
Subject [jira] Updated: (LUCENE-794) SpanScorer and SimpleSpanFragmenter for Contrib Highlighter
Date Wed, 07 Mar 2007 22:37:24 GMT


Mark Miller updated LUCENE-794:


I have finally come up with a way to ignore fields, so the final test (testFieldSpecificHighlighting)
now passes. All of the original Highlighter tests now pass with this patch. Pass null as the
field to SpanScorer and fields will be ignored during highlighting.
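
As a rough illustration of the null-field rule described above, here is a minimal sketch. It
is not the code from the patch and uses no Lucene classes; FieldFilterSketch, FieldTerm and
eligibleTerms are hypothetical names invented for this example. The only assumption is that a
null target field disables the per-field check.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the code from the patch: shows a
// "null field means ignore fields" rule for picking highlightable terms.
final class FieldFilterSketch {

    // A query term tagged with the field it was written against (made-up type).
    static final class FieldTerm {
        final String field;
        final String text;

        FieldTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }
    }

    // Returns the text of the terms that may be highlighted for targetField.
    // A null targetField turns the field check off, so every term is eligible.
    static List<String> eligibleTerms(List<FieldTerm> queryTerms, String targetField) {
        List<String> result = new ArrayList<>();
        for (FieldTerm term : queryTerms) {
            if (targetField == null || targetField.equals(term.field)) {
                result.add(term.text);
            }
        }
        return result;
    }
}
```

With one term on a "title" field and one on a "body" field, asking for "title" keeps only the
first term, while passing null keeps both.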

SpanScorer now has the same behavior as the QueryScorer except that actual hits are highlighted.

I have also made a small fix to the SimpleSpanFragmenter.

I am still not sure whether it is better to change the Highlighter API or to require the rather
nasty call to reset the SpanScorer between calls to getBestFragments.

I have used a zip file this time. It contains the patch plus an index folder that holds a
new class called TermModifier. This was necessary because I cannot add folders to the patch,
but TermModifier needs to be in the org.apache.lucene.index package. First apply the patch,
then add the index folder to the correct place in the Highlighter contrib section.

Not a lot left to do here. What do you think, Mark H?

- Mark

> SpanScorer and SimpleSpanFragmenter for Contrib Highlighter
> -----------------------------------------------------------
>                 Key: LUCENE-794
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Other
>            Reporter: Mark Miller
>            Priority: Minor
>         Attachments: spanhighlighter.patch, spanhighlighter2.patch, spanhighlighter3.patch
> This patch adds a new Scorer class (SpanQueryScorer) to the Highlighter package that
> scores just like QueryScorer, but scores a 0 for Terms that did not cause the Query hit. This
> gives 'actual' hit highlighting for the range of SpanQuerys and PhraseQuery. There is also
> a new Fragmenter that attempts to fragment without breaking up Spans.
> See for some background.
> There is a dependency on MemoryIndex.
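
The difference between term-based scoring and 'actual' hit scoring described in the issue
summary can be sketched with a toy example. This is not the scorer from the patch and it does
not touch Lucene or MemoryIndex; PhraseHitSketch, termScores and hitScores are hypothetical
names, and the query is assumed to be an exact, in-order phrase with no slop.

```java
import java.util.Arrays;
import java.util.List;

// Toy illustration only: not the scorer from the patch, no Lucene classes used.
final class PhraseHitSketch {

    // Term-based scoring: every token equal to any phrase term scores 1.
    static float[] termScores(List<String> tokens, List<String> phrase) {
        float[] scores = new float[tokens.size()];
        for (int i = 0; i < tokens.size(); i++) {
            if (phrase.contains(tokens.get(i))) {
                scores[i] = 1f;
            }
        }
        return scores;
    }

    // Hit-based scoring: a token scores 1 only if it lies inside an exact,
    // in-order occurrence of the phrase; all other tokens score 0.
    static float[] hitScores(List<String> tokens, List<String> phrase) {
        float[] scores = new float[tokens.size()];
        if (phrase.isEmpty()) {
            return scores;
        }
        for (int start = 0; start + phrase.size() <= tokens.size(); start++) {
            int end = start + phrase.size();
            if (tokens.subList(start, end).equals(phrase)) {
                Arrays.fill(scores, start, end, 1f);
            }
        }
        return scores;
    }
}
```

For the tokens "a brown dog saw a brown fox" and the phrase "brown fox", termScores marks both
occurrences of "brown", while hitScores marks only the final "brown fox" pair, which is the
part of the text that actually produced the hit.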

