lucene-dev mailing list archives

From Doug Cutting
Subject Re: SweetSpotSimiliarity
Date Wed, 24 May 2006 16:33:35 GMT
Marvin Humphrey wrote:
> The only answer seems to be to apply different lengthNorm algos to  
> different fields.

FYI, Nutch uses the following:

All of this is seat-of-the-pants, developed by hand-tuning a few 
queries.  Like code optimization, relevance tuning is better done with 
large amounts of real data.  If you have trusted relevant/non-relevant 
judgements for a representative sample of queries, then you can do a 
much better job of setting these parameters.  Unfortunately, such 
judgements are expensive to generate.
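
As a rough illustration of tuning against judgements, the sketch below scores candidate parameter settings by mean precision at k over a set of judged queries and picks the best one. It is only a minimal sketch: the setting names, query ids, document ids, and the choice of precision at k as the metric are all hypothetical, and none of it comes from Nutch or Lucene.

```python
from typing import Dict, List, Set


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def mean_precision_at_k(
    runs: Dict[str, List[str]], judgements: Dict[str, Set[str]], k: int
) -> float:
    """Average precision@k over every judged query that has a ranked run."""
    scores = [
        precision_at_k(runs[query], relevant, k)
        for query, relevant in judgements.items()
        if query in runs
    ]
    return sum(scores) / len(scores) if scores else 0.0


def best_setting(
    runs_by_setting: Dict[str, Dict[str, List[str]]],
    judgements: Dict[str, Set[str]],
    k: int,
) -> str:
    """Name of the setting whose rankings score highest on the judgements."""
    return max(
        runs_by_setting,
        key=lambda name: mean_precision_at_k(runs_by_setting[name], judgements, k),
    )


# Hypothetical judgements: query id -> set of relevant document ids.
judgements = {"q1": {"d1", "d3"}, "q2": {"d7"}}

# Hypothetical rankings produced under two different lengthNorm settings.
runs_by_setting = {
    "setting_a": {"q1": ["d1", "d2", "d3"], "q2": ["d8", "d7", "d9"]},
    "setting_b": {"q1": ["d2", "d4", "d1"], "q2": ["d9", "d8", "d6"]},
}

print(best_setting(runs_by_setting, judgements, k=2))
```

With only two queries the comparison means little; the point of the message stands, which is that the sample of judged queries has to be large and representative before the winning setting can be trusted.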

For Web data, one source of relevance judgements is:

