lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erik Hatcher <>
Subject Re: similarity of two texts
Date Tue, 01 Jun 2004 13:39:38 GMT
On Jun 1, 2004, at 9:24 AM, Grant Ingersoll wrote:
> Hey Eric,

Eri*K*  :)

> What did you do to calc similarity?

I computed the angle between two vectors.  The vectors are obtained 
from IndexReader.getTermFreqVector(docId, "field").

>   I haven't had time, but was thinking of ways to add the ability to 
> get the similarity score (as calculated when doing a search) given a 
> term vector (or just a document id).

It would be quite compute-intensive to do something like this.  This 
could be done through a custom sort as well, if applying it at the 
scoring level doesn't work.  I haven't given any thought to how this 
could work for scoring or sorting before, but does sound quite 

>   Any ideas on how to approach this would be appreciated.  The scoring 
> in Lucene has always been a bit confusing to me, despite looking at 
> the code several times, especially once you get into boolean queries, 
> etc.

No doubt that it is confusing - to me also.  But Explanation is your 


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message