lucene-java-user mailing list archives

From "Andy Nauli" <andy.na...@utoronto.ca>
Subject computing document similarity
Date Thu, 15 May 2003 14:53:33 GMT
Hi Lucene developers,

I am planning to use Lucene to calculate document similarity.

Here's my plan:

I have a set of similar documents. I will index these documents and
extract, say, the 20 most frequent keywords.

When new documents become available, I want to calculate their similarity
to this set using the extracted keywords.

How feasible is this in Lucene?
What is the best way to do it? I have implemented the part that extracts
the most frequent keywords; what is left is the document similarity
calculation.
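
To make the plan concrete, here is a minimal sketch in plain Python, not
Lucene's API. It assumes raw term counts as keyword weights and cosine
similarity as the measure, neither of which is fixed by the plan above.
The function names (tokenize, top_keywords, keyword_vector,
cosine_similarity, similarity_to_set) are hypothetical, and no stop-word
filtering is done, so very common words would dominate on real text.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def top_keywords(documents, k=20):
    """Return the k most frequent terms across the reference documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return [term for term, _ in counts.most_common(k)]


def keyword_vector(text, keywords):
    """Count how often each keyword occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[term] for term in keywords]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no keyword overlap is possible with an empty vector
    return dot / (norm_a * norm_b)


def similarity_to_set(new_doc, documents, k=20):
    """Score a new document against the keyword profile of the set."""
    keywords = top_keywords(documents, k)
    profile = keyword_vector(" ".join(documents), keywords)
    return cosine_similarity(keyword_vector(new_doc, keywords), profile)
```

Calling similarity_to_set(new_text, reference_texts) gives a score between
0 and 1, where 0 means the new document contains none of the extracted
keywords. Any cut-off for "similar enough" would have to be chosen by
experiment.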

Thanks,
Andy



