mahout-user mailing list archives

From Pat Ferrel <...@occamsmachete.com>
Subject Row Similarity
Date Mon, 18 Aug 2014 17:09:41 GMT
The Spark version of itemsimilarity only has LLR as a metric. But what about RSJ
(RowSimilarityJob)? It’s pretty simple to convert itemsimilarity to rowsimilarity, but RSJ
has some uses beyond collaborative filtering. Are some of the other similarity metrics needed?
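
For anyone not familiar with the LLR metric mentioned above, here is a rough sketch of the log-likelihood ratio on a 2x2 cooccurrence table, in the entropy form. This is plain Python for illustration, not the Spark code; the function names and the k11..k22 argument naming are my own assumptions.

```python
import math

def x_log_x(x):
    # Convention: 0 * log(0) is treated as 0.
    return 0.0 if x == 0 else x * math.log(x)

def entropy(*counts):
    # Unnormalized Shannon entropy of a list of counts.
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)

def llr(k11, k12, k21, k22):
    """Log-likelihood ratio for a 2x2 contingency table.

    k11: rows where both items occur   k12: only item A occurs
    k21: only item B occurs            k22: neither occurs
    """
    row_entropy = entropy(k11 + k12, k21 + k22)
    col_entropy = entropy(k11 + k21, k12 + k22)
    mat_entropy = entropy(k11, k12, k21, k22)
    # Guard against tiny negative values caused by floating-point round-off.
    return max(0.0, 2.0 * (row_entropy + col_entropy - mat_entropy))
```

An independent table such as llr(10, 10, 10, 10) scores essentially zero, while a strongly associated one such as llr(10, 0, 0, 10) scores well above zero.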

Specifically, text comparison typically weights the terms in the DRM with TF-IDF. A user
would then apply SIMILARITY_COSINE or SIMILARITY_TANIMOTO. Then again, I’ve seen some people
use LLR for this too.
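
To make the text case concrete, here is a minimal sketch of TF-IDF weighting followed by cosine and Tanimoto similarity between two rows. It is plain Python with rows as term-to-count dicts, not Mahout's DRM API; the helper names and the simple log(N / df) form of IDF are assumptions for illustration only.

```python
import math

def tfidf_rows(rows):
    """rows: list of {term: raw count}. Returns a list of {term: tf-idf weight}."""
    n = len(rows)
    df = {}
    for row in rows:
        for term in row:
            df[term] = df.get(term, 0) + 1
    # Assumed weighting: raw term frequency times log(N / document frequency).
    return [{t: c * math.log(n / df[t]) for t, c in row.items()} for row in rows]

def dot(a, b):
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def cosine(a, b):
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0

def tanimoto(a, b):
    # Extended Jaccard: a.b / (|a|^2 + |b|^2 - a.b)
    ab = dot(a, b)
    denom = dot(a, a) + dot(b, b) - ab
    return ab / denom if denom else 0.0

docs = [
    {"spark": 2, "mahout": 1},
    {"spark": 1, "mahout": 1, "llr": 1},
    {"hadoop": 3},
]
weighted = tfidf_rows(docs)
```

With these toy rows, the first two documents share weighted terms and get a similarity strictly between 0 and 1 under both metrics, while the third shares nothing with the others and scores 0.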

Are some of the other similarity metrics needed for RSJ?