lucene-general mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: How to do prefix/phrase matching with term-length-sensitive scoring?
Date Thu, 11 Mar 2010 02:02:30 GMT

: Given a list of prefixes, what is the simplest way to match them against
: a text field, giving preference to shorter term matches?

I would suggest using edge n-grams (indexing every leading prefix of each 
term) and sorting on a numeric field containing the "length" of the term.
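
To make that concrete, here is a rough sketch in plain Python rather than Lucene. The names edge_ngrams, build_index and search are made up for illustration, and it assumes whitespace-separated, lowercased terms and that a document should rank by its shortest matching term; your analyzer and field layout may well differ.

```python
def edge_ngrams(term, min_len=1):
    """Return every leading prefix of `term`, from min_len characters up to the full term."""
    return [term[:n] for n in range(min_len, len(term) + 1)]


def build_index(docs):
    """Map each edge n-gram to the set of (term_length, doc_id) pairs it was produced from."""
    index = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            for gram in edge_ngrams(term):
                index.setdefault(gram, set()).add((len(term), doc_id))
    return index


def search(index, prefix):
    """Return matching doc ids, ordered by the length of their shortest matching term."""
    best = {}
    for length, doc_id in index.get(prefix.lower(), ()):
        best[doc_id] = min(length, best.get(doc_id, length))
    # Ties on length are broken by doc id (an arbitrary choice for this sketch).
    return sorted(best, key=lambda d: (best[d], d))


docs = {1: "category theory", 2: "cat", 3: "catalog of cats"}
index = build_index(docs)
print(search(index, "cat"))  # [2, 3, 1]
```

The prefix lookup is an exact match against the stored n-grams, so no wildcard expansion happens at query time; the length only decides the ordering.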

:  * Term frequency within the field must be ignored when scoring.

You can omit term frequency info when indexing (sorting will make it 
irrelevant, but there's no reason to waste the space).
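
As an illustration of what gets dropped, here is a toy comparison in plain Python (not Lucene's actual postings format; both function names are hypothetical). With frequencies, each posting carries an occurrence count; without them, a posting is just a doc id, so a repeated term cannot influence anything downstream.

```python
from collections import Counter


def postings_with_freqs(docs):
    """term -> {doc_id: occurrence count}."""
    postings = {}
    for doc_id, text in docs.items():
        for term, count in Counter(text.lower().split()).items():
            postings.setdefault(term, {})[doc_id] = count
    return postings


def postings_without_freqs(docs):
    """term -> sorted list of doc ids; occurrence counts are discarded."""
    postings = {}
    for doc_id, text in docs.items():
        for term in set(text.lower().split()):
            postings.setdefault(term, []).append(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


docs = {1: "cat cat cat", 2: "cat"}
print(postings_with_freqs(docs)["cat"])     # {1: 3, 2: 1}
print(postings_without_freqs(docs)["cat"])  # [1, 2]
```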

:  * Documents and fields are sometimes boosted at index time; norms are
: present.

Hmmm, well, that makes the sorting more complicated. In that case you 
can either fold the boost value into your special "length" field, giving 
you your own magic number for sorting the results, or use a function 
query based approach to meld the (norm-influenced) score with your own 
length field.
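
A rough sketch of both options in plain Python. The two formulas (length divided by boost, and score divided by length) are assumptions picked only to show the shape of each approach, not something Lucene prescribes; the function names and sample hits are likewise made up, and boosts are assumed to be positive.

```python
def sort_key(term_length, boost):
    """Option 1: one precomputed "magic number"; smaller sorts first.

    A boost above 1 shrinks the effective length (assumed formula).
    """
    return term_length / boost


def blended_score(score, term_length):
    """Option 2: function-query style blend; larger sorts first.

    Scales the norm-influenced score by 1 / length (assumed formula).
    """
    return score / term_length


hits = [
    # (doc_id, term_length, index-time boost, norm-influenced score)
    ("a", 3, 1.0, 0.5),
    ("b", 4, 2.0, 1.0),
    ("c", 8, 1.0, 0.5),
]

by_key = [h[0] for h in sorted(hits, key=lambda h: sort_key(h[1], h[2]))]
by_blend = [h[0] for h in sorted(hits, key=lambda h: blended_score(h[3], h[1]), reverse=True)]
print(by_key)    # ['b', 'a', 'c']
print(by_blend)  # ['b', 'a', 'c']
```

The first option is computed once at index time and stored in the sort field, so it only works if the boost is known when the document is indexed. The second is evaluated per hit at query time and leaves the stored length field untouched.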


-Hoss

