lucene-java-user mailing list archives

From Nitish Nitish <nit.sys....@gmail.com>
Subject Getting cosine similarity of any given two Lucene 5.1 Documents using latest APIs
Date Sat, 11 Jul 2015 23:30:40 GMT
Hi All,

Greetings,

   I started using Lucene 5.1 a month ago for my research. I have a set
of documents that were indexed with the term frequencies option enabled.
Given any two of these documents, I would like to calculate their tf-idf
cosine similarity. Could you please point me in the right direction?
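
To make the target quantity concrete, here is a minimal sketch in plain Java
of the computation being asked about. It does not use any Lucene API. It
assumes the per-document term frequencies, the per-term document frequencies,
and the total document count have already been read out of the index somehow.
The class and method names (TfIdfCosine, tfIdfVector, cosine) are hypothetical,
and the weighting (raw tf times 1 + ln(numDocs / (docFreq + 1))) is just one
common tf-idf variant chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: tf-idf cosine similarity over plain maps.
class TfIdfCosine {

    // Builds a tf-idf vector for one document.
    // Assumed weighting: weight = tf * (1 + ln(numDocs / (docFreq + 1))).
    static Map<String, Double> tfIdfVector(Map<String, Integer> termFreqs,
                                           Map<String, Integer> docFreqs,
                                           int numDocs) {
        Map<String, Double> vector = new HashMap<>();
        for (Map.Entry<String, Integer> e : termFreqs.entrySet()) {
            int df = docFreqs.getOrDefault(e.getKey(), 0);
            double idf = 1.0 + Math.log((double) numDocs / (df + 1));
            vector.put(e.getKey(), e.getValue() * idf);
        }
        return vector;
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 if either vector is empty.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            double w = e.getValue();
            normA += w * w;
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += w * other; // only terms shared by both documents contribute
            }
        }
        for (double w : b.values()) {
            normB += w * w;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Made-up toy counts, for illustration only.
        Map<String, Integer> doc1 = new HashMap<>();
        doc1.put("lucene", 2);
        doc1.put("index", 1);
        Map<String, Integer> doc2 = new HashMap<>();
        doc2.put("lucene", 1);
        doc2.put("query", 3);
        Map<String, Integer> docFreqs = new HashMap<>();
        docFreqs.put("lucene", 2);
        docFreqs.put("index", 1);
        docFreqs.put("query", 1);

        double sim = cosine(tfIdfVector(doc1, docFreqs, 2),
                            tfIdfVector(doc2, docFreqs, 2));
        System.out.println("cosine similarity = " + sim);
    }
}
```

Because the vectors are normalized inside cosine, no query norm or boost factor
enters the result, which matches the "purely cosine" score described in this
message.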

   Since my Lucene Document has just one indexed field, can I create a
Query out of one of these documents and, using some Lucene API, run this
newly formed Query against the other document's Field to get their cosine
similarity score? I would prefer to have queryNorm(q), document boost, etc.
set to 1 so that the result is a pure cosine similarity score. Is there any
Lucene API that I can use for this purpose? Thank you!

Thanks and Regards,
Nitish
