lucene-java-user mailing list archives

From David Spencer <dave-lucene-u...@tropo.com>
Subject Re: similarity of two texts
Date Wed, 02 Jun 2004 17:39:20 GMT
Terry Steichen wrote:

> Erik,
> 
> Could you expand on this just a wee bit, perhaps with an example of how to
> compute this vector angle?

I'm tempted to write the code to see how it works, but FYI this doc 
seems to nicely explain the concepts:

http://www.la2600.org/talks/files/20040102/Vector_Space_Search_Engine_Theory.pdf
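
For what it's worth, here is a rough sketch of the angle computation
itself. It is not Erik's actual code, only an illustration under some
assumptions: each document is represented by parallel arrays of terms and
frequencies, which is the shape I believe TermFreqVector.getTerms() and
getTermFrequencies() return for the vector obtained from
IndexReader.getTermFreqVector(docId, "field"). The class and method names
(VectorAngle, toMap, cosine, angle) are made up for this example, and the
weights are raw term frequencies with no IDF or length normalization.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: angle between two term-frequency vectors.
final class VectorAngle {

    // Build a term -> frequency map from parallel arrays (assumed input shape).
    static Map<String, Integer> toMap(String[] terms, int[] freqs) {
        Map<String, Integer> vector = new HashMap<>();
        for (int i = 0; i < terms.length; i++) {
            vector.put(terms[i], freqs[i]);
        }
        return vector;
    }

    // Euclidean length of the vector.
    static double norm(Map<String, Integer> v) {
        double sumOfSquares = 0.0;
        for (int f : v.values()) {
            sumOfSquares += (double) f * f;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Cosine of the angle: dot(a, b) / (|a| * |b|). Returns 0 if either vector is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other; // only shared terms contribute
            }
        }
        double na = norm(a);
        double nb = norm(b);
        if (na == 0.0 || nb == 0.0) {
            return 0.0;
        }
        return dot / (na * nb);
    }

    // Angle in radians: 0 means same direction, PI/2 means no terms in common.
    static double angle(Map<String, Integer> a, Map<String, Integer> b) {
        // Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
        double c = Math.max(-1.0, Math.min(1.0, cosine(a, b)));
        return Math.acos(c);
    }

    public static void main(String[] args) {
        Map<String, Integer> doc1 = toMap(new String[] {"lucene", "search"}, new int[] {2, 1});
        Map<String, Integer> doc2 = toMap(new String[] {"lucene", "index"}, new int[] {1, 1});
        System.out.println("cosine = " + cosine(doc1, doc2));
        System.out.println("angle  = " + angle(doc1, doc2) + " radians");
    }
}
```

A smaller angle (cosine closer to 1) means the two texts use more similar
vocabulary in more similar proportions. With non-negative frequencies the
angle always falls between 0 and PI/2.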
> 
> TIA,
> 
> Terry
> 
> ----- Original Message ----- 
> From: "Erik Hatcher" <erik@ehatchersolutions.com>
> To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> Sent: Tuesday, June 01, 2004 9:39 AM
> Subject: Re: similarity of two texts
> 
> 
> 
>>On Jun 1, 2004, at 9:24 AM, Grant Ingersoll wrote:
>>
>>>Hey Eric,
>>
>>Eri*K*  :)
>>
>>
>>>What did you do to calc similarity?
>>
>>I computed the angle between two vectors.  The vectors are obtained
>>from IndexReader.getTermFreqVector(docId, "field").
>>
>>
>>>  I haven't had time, but was thinking of ways to add the ability to
>>>get the similarity score (as calculated when doing a search) given a
>>>term vector (or just a document id).
>>
>>It would be quite compute-intensive to do something like this.  This
>>could be done through a custom sort as well, if applying it at the
>>scoring level doesn't work.  I haven't given any thought to how this
>>could work for scoring or sorting before, but does sound quite
>>interesting.
>>
>>
>>>  Any ideas on how to approach this would be appreciated.  The scoring
>>>in Lucene has always been a bit confusing to me, despite looking at
>>>the code several times, especially once you get into boolean queries,
>>>etc.
>>
>>No doubt that it is confusing - to me also.  But Explanation is your
>>friend.
>>
>>Erik
>>
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
>>
> 
> 
> 

