lucene-java-user mailing list archives

From "Grant Ingersoll" <GSIng...@syr.edu>
Subject RE: calculate wi = tfi * IDFi for each document.
Date Fri, 03 Jun 2005 11:51:58 GMT
I think the TermFreqVector (IndexReader.getTermFreqVector) has the info
you want per document.  You will need to sort it by frequency to get the
top terms in each document.  It doesn't give you the wi, just the tfi,
but the whole score is implied by the fact that you have the top 10
documents, I think.

-Grant
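
The sorting step suggested in the reply above could look roughly like the
sketch below. It does not call Lucene at all: the parallel `terms` and
`freqs` arrays are stand-ins for what TermFreqVector's getTerms() and
getTermFrequencies() are understood to return, and the class name
`TopTerms`, the method `topTerms`, and the sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TopTerms {

    /**
     * Returns up to k terms ordered by descending frequency.
     * Ties are broken alphabetically so the result is deterministic.
     * terms[i] and freqs[i] are assumed to describe the same term.
     */
    public static List<String> topTerms(String[] terms, int[] freqs, int k) {
        // Sort an index array instead of the data, so the two arrays stay aligned.
        Integer[] order = new Integer[terms.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> {
            int byFreq = Integer.compare(freqs[b], freqs[a]);
            return byFreq != 0 ? byFreq : terms[a].compareTo(terms[b]);
        });

        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(k, order.length); i++) {
            top.add(terms[order[i]]);
        }
        return top;
    }

    public static void main(String[] args) {
        // Made-up term vector for one document.
        String[] terms = {"index", "lucene", "query", "score"};
        int[] freqs = {3, 7, 1, 3};
        System.out.println(topTerms(terms, freqs, 3));
    }
}
```

This ranks by raw tfi only; to rank by wi, each frequency would first have
to be multiplied by the term's IDFi.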

>>> andrew.boyd@mindspring.com 6/2/2005 3:21:35 PM >>>
Ok.  So if I get 10 Documents back from a search and I want to get the
top 5 weighted terms for each of the 10 documents, what API call should
I use?  I'm unable to find the connection between Similarity and a
Document.

I know I'm missing the elephant that must be in the middle of the room.
Or maybe it's not there.
Is what I'm trying to do doable?

Thanks,

Andrew

-----Original Message-----
From: Max Pfingsthorn <m.pfingsthorn@hippo.nl>
Sent: Jun 2, 2005 5:33 AM
To: java-user@lucene.apache.org 
Subject: RE: calculate wi = tfi * IDFi for each document.

Hi,

DefaultSimilarity uses exactly this weighting scheme. Makes sense since
it's a pretty standard relevance measure...

Bye!
max

-----Original Message-----
From: Andrew Boyd [mailto:andrew.boyd@mindspring.com] 
Sent: Thursday, June 02, 2005 11:39
To: java-user@lucene.apache.org 
Subject: calculate wi = tfi * IDFi for each document.


If I have search results, how can I calculate wi = tfi * IDFi for each
document using Lucene's API?

wi   = term weight
tfi  = term frequency in a document
IDFi = inverse document frequency = log(D/dfi)
dfi  = document frequency, i.e. the number of documents containing term i
D    = number of documents in my search result

Thanks,

Andrew
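
For reference, the definitions in the question can be computed directly
once per-document term frequencies are available. The following is a
minimal sketch in plain Java, not Lucene code: the class `TfIdfWeights`,
its methods, and the sample documents are hypothetical, the natural
logarithm is an assumption (the question only says "log"), and the result
is the textbook formula from the question, not necessarily the value
Lucene's Similarity would produce.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TfIdfWeights {

    /** dfi: for each term, the number of documents in the result set containing it. */
    public static Map<String, Integer> documentFrequencies(List<Map<String, Integer>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (Map<String, Integer> doc : docs) {
            for (String term : doc.keySet()) {
                df.merge(term, 1, Integer::sum);
            }
        }
        return df;
    }

    /**
     * wi = tfi * ln(D / dfi) for every term of one document.
     * Assumes the document is part of the set that df was computed from,
     * so every term has a document frequency of at least 1.
     */
    public static Map<String, Double> weights(Map<String, Integer> doc,
                                              Map<String, Integer> df,
                                              int totalDocs) {
        Map<String, Double> w = new HashMap<>();
        for (Map.Entry<String, Integer> e : doc.entrySet()) {
            double idf = Math.log((double) totalDocs / df.get(e.getKey()));
            w.put(e.getKey(), e.getValue() * idf);
        }
        return w;
    }

    public static void main(String[] args) {
        // Two made-up result documents, each a map of term -> tfi.
        Map<String, Integer> d1 = new HashMap<>();
        d1.put("lucene", 2);
        d1.put("search", 1);
        Map<String, Integer> d2 = new HashMap<>();
        d2.put("search", 3);

        List<Map<String, Integer>> docs = new ArrayList<>();
        docs.add(d1);
        docs.add(d2);

        Map<String, Integer> df = documentFrequencies(docs);
        for (Map<String, Integer> doc : docs) {
            System.out.println(weights(doc, df, docs.size()));
        }
    }
}
```

Note that with D taken as the size of the result set, any term that occurs
in every returned document gets IDFi = log(1) = 0 and therefore a weight
of zero.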

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

