lucene-java-user mailing list archives

From "Andy Nauli" <>
Subject computing document similarity
Date Thu, 15 May 2003 14:53:33 GMT
Hi Lucene developers,

I am planning to use Lucene to calculate document similarity.

Here is my plan:

I have a set of similar documents. I will index these documents and
extract, say, the top 20 most frequent keywords from the index.

When new documents become available, I want to calculate their similarity
to the original set using these extracted keywords.

How feasible is this in Lucene?
What is the best way to do it? I have implemented the part that extracts
the most frequent keywords; what is left is computing the document similarity.
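
To make the plan concrete, here is a rough sketch of the idea in plain Java,
without any Lucene calls. It is only an illustration under several
assumptions: the class name KeywordSimilarity and its methods are made up for
this example, tokenization is a naive split on non-alphanumeric characters
(no stop words, no stemming), and similarity is a cosine score computed only
over the extracted keywords, so a new document that contains none of them
scores 0.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene: keyword-profile similarity sketch.
final class KeywordSimilarity {

    // Count how often each lower-cased alphanumeric token occurs in the text.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Sum term counts over all documents and keep the k most frequent terms.
    // Ties are broken alphabetically so the result is deterministic.
    static Map<String, Integer> topKeywords(List<String> docs, int k) {
        Map<String, Integer> totals = new HashMap<>();
        for (String doc : docs) {
            termFrequencies(doc).forEach((t, c) -> totals.merge(t, c, Integer::sum));
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(totals.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        Map<String, Integer> top = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries.subList(0, Math.min(k, entries.size()))) {
            top.put(e.getKey(), e.getValue());
        }
        return top;
    }

    // Cosine similarity between the keyword profile and a new document,
    // restricted to the profile's keywords (an assumption of this sketch).
    static double cosine(Map<String, Integer> profile, String newDoc) {
        Map<String, Integer> tf = termFrequencies(newDoc);
        double dot = 0, profileNorm = 0, docNorm = 0;
        for (Map.Entry<String, Integer> e : profile.entrySet()) {
            int p = e.getValue();
            int d = tf.getOrDefault(e.getKey(), 0);
            dot += (double) p * d;
            profileNorm += (double) p * p;
            docNorm += (double) d * d;
        }
        if (profileNorm == 0 || docNorm == 0) {
            return 0.0; // no shared keywords, or an empty profile
        }
        return dot / (Math.sqrt(profileNorm) * Math.sqrt(docNorm));
    }

    public static void main(String[] args) {
        List<String> similarDocs = Arrays.asList(
                "lucene index search engine",
                "building a lucene index",
                "search with lucene");
        Map<String, Integer> profile = topKeywords(similarDocs, 20);
        System.out.println(cosine(profile, "how to search a lucene index"));
    }
}
```

With real data, the keyword counts would presumably come from the index
rather than from re-tokenizing the raw text, and the score could be compared
against a threshold chosen by experiment.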

