mahout-user mailing list archives

From charlysf <>
Subject Compute similarities for a huge quantity of data
Date Mon, 06 Jul 2009 23:36:57 GMT


I am currently working on a small database. As I understand it, when I need the
similarity between users, it basically means computing it between all pairs of users.

Is that right, or is there something better?
If that is right, how can I expect a quick computation for 1 million rows?
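
To put a number on my worry, here is a small sketch in plain Python (not Mahout code). It assumes that "rows" means users and that every unordered pair is compared exactly once; the function name pair_count is just something I made up for the illustration.

```python
def pair_count(n: int) -> int:
    """Number of unordered user pairs among n users: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# How the number of similarity computations grows with the number of users.
for n in (1_000, 100_000, 1_000_000):
    print(f"{n:>9,} users -> {pair_count(n):,} pairs")
```

The growth is quadratic, so going from a small database to 1 million users multiplies the work far more than it multiplies the data.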

I don't see the difference between asking for a neighborhood and
computing the similarity for all pairs of users.

I ask because I thought something like this could be interesting:
make some clusters of users, and only compute the similarity between users
in the same cluster.
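
A rough sketch of the clustering idea, again in plain Python and not using any Mahout API. Everything here is an assumption for illustration: the ratings layout (user -> {item: rating}), the cluster_of mapping (which I take as already computed by some clustering step), the choice of cosine similarity, and the function names.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarities_within_clusters(ratings, cluster_of):
    """Compute similarities only for pairs of users that share a cluster."""
    # Group users by their (hypothetical, precomputed) cluster id.
    members = defaultdict(list)
    for user in ratings:
        members[cluster_of[user]].append(user)

    sims = {}
    for users in members.values():
        # Pairs are formed inside one cluster only, never across clusters.
        for u, v in combinations(sorted(users), 2):
            sims[(u, v)] = cosine(ratings[u], ratings[v])
    return sims


# Toy data: four users in two clusters.
ratings = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 4, "b": 2},
    "u3": {"c": 5},
    "u4": {"c": 4, "d": 1},
}
cluster_of = {"u1": 0, "u2": 0, "u3": 1, "u4": 1}

sims = similarities_within_clusters(ratings, cluster_of)
print(sims)
```

With four users, all pairs would mean 6 similarity computations; restricting to clusters leaves 2 here. The price is that two similar users who land in different clusters are never compared.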
