lucene-java-user mailing list archives

From "Terry Steichen" <te...@net-frame.com>
Subject Re: similarity of two texts
Date Wed, 02 Jun 2004 17:20:37 GMT
Erik,

Could you expand on this just a wee bit, perhaps with an example of how to
compute this vector angle?

TIA,

Terry

----- Original Message ----- 
From: "Erik Hatcher" <erik@ehatchersolutions.com>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Tuesday, June 01, 2004 9:39 AM
Subject: Re: similarity of two texts


> On Jun 1, 2004, at 9:24 AM, Grant Ingersoll wrote:
> > Hey Eric,
>
> Eri*K*  :)
>
> > What did you do to calc similarity?
>
> I computed the angle between two vectors.  The vectors are obtained
> from IndexReader.getTermFreqVector(docId, "field").
>
> >   I haven't had time, but was thinking of ways to add the ability to
> > get the similarity score (as calculated when doing a search) given a
> > term vector (or just a document id).
>
> It would be quite compute-intensive to do something like this.  This
> could be done through a custom sort as well, if applying it at the
> scoring level doesn't work.  I haven't given any thought to how this
> could work for scoring or sorting before, but it does sound quite
> interesting.
>
> >   Any ideas on how to approach this would be appreciated.  The scoring
> > in Lucene has always been a bit confusing to me, despite looking at
> > the code several times, especially once you get into boolean queries,
> > etc.
>
> No doubt that it is confusing - to me also.  But Explanation is your
> friend.
>
> Erik
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>
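
Editorial note: a minimal sketch of the vector-angle computation asked
about above, not Erik's actual code. It assumes each document's term
vector is available as two parallel arrays, such as those returned by
TermFreqVector.getTerms() and TermFreqVector.getTermFrequencies() on the
result of IndexReader.getTermFreqVector(docId, "field"). The class and
method names (VectorAngle, cosine, angle) are hypothetical, and raw term
frequencies are used with no idf or length weighting. The cosine of the
angle is the dot product of the two vectors divided by the product of
their lengths; the angle itself is the arc cosine of that value.

```java
import java.util.HashMap;
import java.util.Map;

public class VectorAngle {

    /**
     * Cosine of the angle between two term-frequency vectors, each given
     * as parallel arrays of terms and raw frequencies (an assumed input
     * shape, chosen to match what a term vector exposes).
     */
    public static double cosine(String[] termsA, int[] freqsA,
                                String[] termsB, int[] freqsB) {
        // Index the first vector by term and accumulate its squared length.
        Map<String, Integer> a = new HashMap<>();
        double normA = 0.0;
        for (int i = 0; i < termsA.length; i++) {
            a.put(termsA[i], freqsA[i]);
            normA += (double) freqsA[i] * freqsA[i];
        }

        // Walk the second vector: only shared terms contribute to the dot product.
        double dot = 0.0;
        double normB = 0.0;
        for (int i = 0; i < termsB.length; i++) {
            normB += (double) freqsB[i] * freqsB[i];
            Integer fa = a.get(termsB[i]);
            if (fa != null) {
                dot += (double) fa * freqsB[i];
            }
        }

        // An empty vector has no direction; treat it as having no similarity.
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Angle in radians: 0 for identical direction, pi/2 for no shared terms. */
    public static double angle(String[] termsA, int[] freqsA,
                               String[] termsB, int[] freqsB) {
        double c = cosine(termsA, freqsA, termsB, freqsB);
        // Clamp to guard against floating-point drift just outside [0, 1].
        c = Math.max(0.0, Math.min(1.0, c));
        return Math.acos(c);
    }
}
```

A smaller angle (cosine closer to 1) means the two texts use terms in more
similar proportions. Because frequencies are never negative, the angle
always falls between 0 and pi/2.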

