mahout-user mailing list archives

From charlysf <charles.rue...@gmail.com>
Subject Compute similarities for a huge quantity of data
Date Mon, 06 Jul 2009 23:36:57 GMT

Hello,

I am currently working on a small database. As I understand it, when I need
the similarity between users, it is basically computed between all pairs of
users.

Is that right, or is there a better way?
If that is how it works, how can I expect a quick computation for 1 million rows?

I don't see the difference between asking for the neighborhood and
computing the similarity for all pairs of users.
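
To put rough numbers on the two cases, here is a small sketch of the arithmetic.
It assumes a brute-force approach in both cases: "all pairs" means every
unordered pair of users, and "neighborhood" means comparing one user against
every other user. The function names are only for illustration and are not
part of Mahout.

```python
def all_pairs_count(n_users: int) -> int:
    """Number of unordered user pairs: n * (n - 1) / 2."""
    return n_users * (n_users - 1) // 2


def single_neighborhood_count(n_users: int) -> int:
    """Similarities needed to rank every other user against one user."""
    return n_users - 1


n = 1_000_000
print(all_pairs_count(n))            # 499999500000
print(single_neighborhood_count(n))  # 999999
```

So for 1 million users, the full pairwise computation is about 5 x 10^11
similarities, while a brute-force neighborhood for a single user is about
10^6.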

I thought the following could be interesting:
group the users into clusters, and only compute the similarity between users
in the same cluster.
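
A minimal sketch of what I mean, in plain Python rather than Mahout code.
Everything here is an assumption made for illustration: the cluster
assignment (cluster_of) is taken as already given, cosine similarity stands in
for whatever similarity measure is actually used, and the user names and
ratings are made up.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    shared = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarities_within_clusters(ratings, cluster_of):
    """Compute similarities only for pairs of users sharing a cluster."""
    members = defaultdict(list)
    for user in ratings:
        members[cluster_of[user]].append(user)

    sims = {}
    for users in members.values():
        # Pairs are only formed inside one cluster, never across clusters.
        for u, v in combinations(sorted(users), 2):
            sims[(u, v)] = cosine(ratings[u], ratings[v])
    return sims


# Made-up toy data: four users, two clusters of two users each.
ratings = {
    "alice": {"item1": 5.0, "item2": 3.0},
    "bob": {"item1": 4.0, "item2": 3.0},
    "carol": {"item3": 2.0},
    "dave": {"item3": 4.0, "item4": 1.0},
}
cluster_of = {"alice": 0, "bob": 0, "carol": 1, "dave": 1}

sims = similarities_within_clusters(ratings, cluster_of)
print(len(sims))  # 2 pairs instead of the 6 needed for all pairs
```

The obvious cost is that two users placed in different clusters are never
compared, so the result depends on how good the clustering is.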

Thanks
-- 
View this message in context: http://www.nabble.com/Compute-similarities-for-an-hudge-quantity-of-data-tp24364711p24364711.html
Sent from the Mahout User List mailing list archive at Nabble.com.

