lucene-java-user mailing list archives

From Matt Chaput <m...@sidefx.com>
Subject Document comparison
Date Fri, 18 Feb 2005 21:26:03 GMT
Is there a simple, efficient way to compute similarity of documents 
indexed with Lucene?

My first, naive idea is to use the entire contents of one document as a 
query against the second document, and use the resulting score as a 
similarity measurement. But I think I'm probably way off base with that.
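
To make the question concrete, here is a minimal sketch of one common way to
compare two documents directly: cosine similarity over raw term counts. It is
plain Java and does not use the Lucene API at all. The names DocSimilarity,
termCounts and cosine are hypothetical, and the crude split-on-punctuation
tokenizer is an assumption standing in for whatever analyzer the index uses.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene: compares two texts by the
// cosine of the angle between their term-count vectors.
public class DocSimilarity {

    // Lowercase the text, split on runs of non-letter/non-digit characters,
    // and count how often each term occurs. (Assumed tokenizer.)
    static Map<String, Integer> termCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Euclidean length of a term-count vector.
    static double norm(Map<String, Integer> v) {
        double sum = 0.0;
        for (int c : v.values()) {
            sum += (double) c * c;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity: dot product divided by the product of the lengths.
    // Returns 0.0 if either document has no terms.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double na = norm(a);
        double nb = norm(b);
        if (na == 0.0 || nb == 0.0) {
            return 0.0;
        }
        return dot / (na * nb);
    }

    public static void main(String[] args) {
        Map<String, Integer> d1 = termCounts("the cat sat");
        Map<String, Integer> d2 = termCounts("the cat ran");
        System.out.println(cosine(d1, d2));
    }
}
```

With raw counts the result lies between 0 (no shared terms) and 1 (same term
proportions), and it does not depend on which document is treated as the
"query". Weighting the counts by inverse document frequency would be a
natural refinement, but that is left out of this sketch.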

Can any IR pros set me straight? Thanks very much.

Matt


--
Matt Chaput
Word Monkey
Side Effects Software Inc.

"A goddamned ray of sunshine all the goddamned time"
-- Sparkle Hayter



