lucene-dev mailing list archives

From "S.L. (JIRA)" <>
Subject [jira] [Created] (LUCENE-3440) FastVectorHighlighter: IDF-weighted terms for ordered fragments
Date Tue, 20 Sep 2011 09:33:09 GMT
FastVectorHighlighter: IDF-weighted terms for ordered fragments 

                 Key: LUCENE-3440
             Project: Lucene - Java
          Issue Type: Improvement
          Components: modules/highlighter
    Affects Versions: 3.5
            Reporter: S.L.
            Priority: Minor
             Fix For: 3.5

The FastVectorHighlighter gives every term found in a fragment an equal weight. As a result,
fragments with a high number of words or, in the worst case, a high number of very common words
are ranked higher than fragments that contain *all* of the terms used in the original query.

This patch provides ordered fragments with IDF-weighted terms: 

total weight = total weight + IDF for unique term per fragment * boost of query; 
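
To make the formula above concrete, here is a minimal sketch in plain Java. It is not the code from the patch: the class and method names are hypothetical, the document frequencies are passed in as a map rather than read from an index, and the IDF shape log(numDocs / (docFreq + 1)) + 1 is an assumption modelled on Lucene's classic default similarity.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names come from the patch.
class IdfFragmentWeightSketch {

    // Assumed IDF shape: log(numDocs / (docFreq + 1)) + 1.
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    // Sums IDF * queryBoost once per unique term matched in the fragment.
    static double totalWeight(List<String> matchedTerms,
                              Map<String, Integer> docFreqs,
                              int numDocs,
                              double queryBoost) {
        Set<String> seen = new HashSet<String>();
        double totalWeight = 0.0;
        for (String term : matchedTerms) {
            if (!seen.add(term)) {
                continue; // repeated term: already counted for this fragment
            }
            Integer docFreq = docFreqs.get(term);
            if (docFreq == null) {
                continue; // no statistics for this term: contributes nothing
            }
            totalWeight = totalWeight + idf(docFreq, numDocs) * queryBoost;
        }
        return totalWeight;
    }

    public static void main(String[] args) {
        Map<String, Integer> docFreqs = new HashMap<String, Integer>();
        docFreqs.put("the", 990);        // very common term
        docFreqs.put("lucene", 10);      // rare term
        docFreqs.put("highlighter", 5);  // rare term

        List<String> commonOnly = Arrays.asList("the", "the", "the", "the");
        List<String> allQueryTerms = Arrays.asList("lucene", "highlighter");

        System.out.println("common-only fragment: "
                + totalWeight(commonOnly, docFreqs, 1000, 1.0));
        System.out.println("all-query-terms fragment: "
                + totalWeight(allQueryTerms, docFreqs, 1000, 1.0));
    }
}
```

With equal weights the first fragment would win on match count (4 against 2). With per-fragment unique terms and IDF weighting, the fragment containing both rare query terms receives the larger total weight.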

The ranking formula should be the same as, or at least similar to, the one used in

The patch is simple, but it works for us. 

Some ideas:
- A better approach would be to move the whole fragment scoring into a separate class.
- Make the scoring switchable via a parameter.
- Exact phrases should be given an even better score, regardless of whether a phrase query was
executed or not.
- The edismax/dismax parameters pf, ps and pf^boost should be respected, and the corresponding
fragments should be ranked higher.
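
The first two ideas could look roughly like the sketch below: scoring sits behind a small interface and the implementation is chosen by a parameter. Every name here (FragmentScorer, EqualWeightScorer, WeightedTermScorer, FragmentScorers.forMode and the mode strings) is hypothetical and exists neither in the patch nor in Lucene. The per-term weights are assumed to be precomputed, for example as IDF times query boost.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical strategy interface: one score per fragment.
interface FragmentScorer {
    double score(List<String> matchedTerms);
}

// Equal weight per matched term occurrence (the behaviour described above).
final class EqualWeightScorer implements FragmentScorer {
    public double score(List<String> matchedTerms) {
        return matchedTerms.size();
    }
}

// Adds a precomputed weight once per unique matched term.
final class WeightedTermScorer implements FragmentScorer {
    private final Map<String, Double> termWeights;

    WeightedTermScorer(Map<String, Double> termWeights) {
        this.termWeights = termWeights;
    }

    public double score(List<String> matchedTerms) {
        Set<String> seen = new HashSet<String>();
        double total = 0.0;
        for (String term : matchedTerms) {
            Double weight = termWeights.get(term);
            if (weight != null && seen.add(term)) {
                total += weight;
            }
        }
        return total;
    }
}

// Selects the scoring strategy from a parameter value.
final class FragmentScorers {
    private FragmentScorers() {}

    static FragmentScorer forMode(String mode, Map<String, Double> termWeights) {
        if ("weighted".equals(mode)) {
            return new WeightedTermScorer(termWeights);
        }
        if ("equal".equals(mode)) {
            return new EqualWeightScorer();
        }
        throw new IllegalArgumentException("Unknown scoring mode: " + mode);
    }
}
```

Keeping the scorer separate would also leave room for further strategies, such as one that rewards exact phrases, without touching the fragment-building code.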
