lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bruce Ritchie" <br...@jivesoftware.com>
Subject RE: TFIDF Implementation
Date Tue, 14 Dec 2004 17:34:46 GMT
Christoph,

I'm not entirely certain if this is what you want, but a while back David Spencer did code
up a 'More Like This' class which can be used for generating similarities between documents.
I can't seem to find this class in the sandbox so I've attached it here. Just repackage and
test.


Regards,

Bruce Ritchie
http://www.jivesoftware.com/   

> -----Original Message-----
> From: Christoph Kiefer [mailto:kiefer@ifi.unizh.ch] 
> Sent: December 14, 2004 11:45 AM
> To: Lucene Users List
> Subject: TFIDF Implementation
> 
> Hi,
> My current task/problem is the following: I need to implement 
> TFIDF document term ranking using Jakarta Lucene to compute a 
> similarity rank between arbitrary documents in the constructed index.
> I saw from the API that there are similar functions already 
> implemented in the class Similarity and DefaultSimilarity but 
> I don't know exactly how to use them. At the time my index 
> has about 25000 (small) documents and there are about 75000 
> terms stored in total.
> Now, my question is simple. Does anybody has done this before 
> or could point me to another location for help?
> 
> Thanks for any help in advance.
> Christoph 
> 
> --
> Christoph Kiefer
> 
> Department of Informatics, University of Zurich
> 
> Office: Uni Irchel 27-K-32
> Phone:  +41 (0) 44 / 635 67 26
> Email:  kiefer@ifi.unizh.ch
> Web:    http://www.ifi.unizh.ch/ddis/christophkiefer.0.html
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> 


Mime
View raw message