mahout-dev mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject RowSimilarity ?'s
Date Thu, 14 Jul 2011 16:57:06 GMT
Are there docs on RowSimilarity?  Also, has anyone tried it at scale?  I'm seeing some long
running times for a matrix that I don't think is huge (still waiting to hear from a colleague
about the actual size).  What does the distributed vector similarity get us over just using our
existing distance measures?

Also, would there be interest in a job that is basically the map side of K-Means and simply
outputs the distance between each input vector and each vector in a list of seed vectors, where
the seed vectors fit in memory? It's similar to RowSimilarity, but it doesn't bother with the
co-occurrence calculation.
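
To make the idea concrete, here is a rough sketch of what one map call in such a job
might compute. It assumes Euclidean distance and plain Java arrays rather than Mahout's
own vector and distance measure types, and the class and method names are hypothetical,
not existing Mahout code.

```java
import java.util.List;

/** Hypothetical sketch of the per-row work; not existing Mahout code. */
final class SeedDistanceSketch {

    /** Euclidean distance between two dense vectors of equal length. */
    static double euclidean(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /**
     * What a single map() call would emit for one input row:
     * one distance per in-memory seed vector, in seed order.
     */
    static double[] distances(double[] row, List<double[]> seeds) {
        double[] out = new double[seeds.size()];
        for (int s = 0; s < seeds.size(); s++) {
            out[s] = euclidean(row, seeds.get(s));
        }
        return out;
    }
}
```

In a real job the seed list would be loaded once per mapper (for example in setup), and
each row would be processed independently, so no reduce step or co-occurrence pass is needed.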


-Grant



