lucene-java-user mailing list archives

From Grant Ingersoll <>
Subject Re: ApacheCon next week
Date Tue, 13 Dec 2005 01:29:42 GMT
We use boosts calculated from the term frequencies and the standard 
alpha, beta, and gamma multipliers from Rocchio.  Terms from non-relevant 
documents decrement the frequency.  If a term's weight ends up <= 0, we 
remove the term (someone has posted a contribution for dealing with 
negative weights; we just haven't adopted it yet).  I am sure there are 
more things you could do; we just haven't investigated much.  We also 
give higher weights to things we think are more important based on our 
NLP analysis.
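
A minimal sketch of this kind of Rocchio-style boost calculation, for illustration only. It is not the code discussed in this thread: the class and method names and the alpha/beta/gamma values are hypothetical, and the term frequencies are assumed to be already aggregated over the judged documents. The NLP-based weighting is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene or of the code described above.
final class RocchioBoosts {
    // Assumed multipliers; tune for your own collection.
    static final double ALPHA = 1.0;   // weight of the original query terms
    static final double BETA = 0.75;   // weight of terms from relevant documents
    static final double GAMMA = 0.15;  // penalty for terms from non-relevant documents

    /**
     * Combines term frequencies into one boost per term:
     * alpha * queryFreq + beta * relevantFreq - gamma * nonRelevantFreq.
     * Terms whose combined weight is <= 0 are dropped.
     */
    static Map<String, Double> computeBoosts(Map<String, Integer> queryFreqs,
                                             Map<String, Integer> relevantFreqs,
                                             Map<String, Integer> nonRelevantFreqs) {
        Map<String, Double> weights = new LinkedHashMap<>();
        queryFreqs.forEach((term, f) -> weights.merge(term, ALPHA * f, Double::sum));
        relevantFreqs.forEach((term, f) -> weights.merge(term, BETA * f, Double::sum));
        // Non-relevant occurrences decrement the accumulated weight.
        nonRelevantFreqs.forEach((term, f) -> weights.merge(term, -GAMMA * f, Double::sum));
        // Remove terms that ended up with a zero or negative weight.
        weights.values().removeIf(w -> w <= 0.0);
        return weights;
    }

    public static void main(String[] args) {
        Map<String, Integer> query = new LinkedHashMap<>();
        query.put("lucene", 1);
        query.put("search", 1);
        Map<String, Integer> relevant = new LinkedHashMap<>();
        relevant.put("lucene", 2);
        relevant.put("feedback", 3);
        Map<String, Integer> nonRelevant = new LinkedHashMap<>();
        nonRelevant.put("search", 10);
        nonRelevant.put("spam", 4);
        // "search" and "spam" end up <= 0 and are removed.
        System.out.println(computeBoosts(query, relevant, nonRelevant));
    }
}
```

Each surviving weight would then be applied as the boost on the corresponding term clause when the expanded query is built.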

Ian Soboroff wrote:

>Grant Ingersoll <> writes:
>>You stole my thunder!  :-)  Was going to post the URL after doing the
>>actual talk, but that's all right.  I will post a few changes I have
>>made on the plane tonight or tomorrow to the website below.
>>Let me know if you have any questions...
>I have one.  I've been thinking about the problem with doing relevance
>feedback in Lucene, and I appreciate seeing your code on getting the
>top terms from a single document.
>However, the real problem for RF and pseudo-RF techniques is forming
>the query.  You can obviously add terms to a query, but how are you
>handling the weighting?  With boosts, or something more sophisticated?

Grant Ingersoll 
Sr. Software Engineer 
Center for Natural Language Processing 
Syracuse University 
School of Information Studies 
337 Hinds Hall 
Syracuse, NY 13244 
Voice:  315-443-5484 
Fax: 315-443-6886 
