mahout-user mailing list archives

From Sanjib Kumar Das <>
Subject Need for a distributed SVDRecommender
Date Fri, 19 Nov 2010 21:34:03 GMT
Hi All,

I wanted to run a distributed RecommenderJob with the SVDRecommender, so I
ran the pseudo.RecommenderJob with an
SVDRecommender(numFeatures=30, trainingSteps=50) on the 1M MovieLens data
(6040 users). It generated 10 recommendations for each of the 6040 users,
but took 14 hours to do so! My Hadoop cluster had 12 machines. My guess is
that it just ran multiple instances of the non-distributed SVD
implementation, and that each of these instances repeated the same work. So
unless the recommender implementation itself is distributed, we don't get
any real benefit from the pseudo.RecommenderJob.

But the item.RecommenderJob produces the same 10 recommendations for each of
the 6040 users in 38 minutes. This is because it has an underlying distributed
implementation.
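
To make the comparison concrete, here is a rough back-of-the-envelope sketch
in Java. It is only a toy cost model, assuming the guess above is right: every
worker trains the full model by itself and then serves its share of the users.
The class name, the method names, and the training and recommendation times
used in main are all invented for illustration; only the 14 hours and the 38
minutes come from the runs described above.

```java
public class PseudoJobCost {

    // Wall-clock estimate under the assumption that training is NOT
    // parallelized: each worker trains the whole model, then the
    // recommendation work is split evenly across the workers.
    public static double wallClockMinutes(double trainMinutes,
                                          double recommendMinutesTotal,
                                          int workers) {
        return trainMinutes + recommendMinutesTotal / workers;
    }

    // Total machine time burned across the cluster under the same
    // assumption: training is paid once per worker.
    public static double machineMinutes(double trainMinutes,
                                        double recommendMinutesTotal,
                                        int workers) {
        return workers * trainMinutes + recommendMinutesTotal;
    }

    // Ratio between two run times given in minutes.
    public static double ratio(double slowMinutes, double fastMinutes) {
        return slowMinutes / fastMinutes;
    }

    public static void main(String[] args) {
        // Reported run times: 14 hours (pseudo) vs 38 minutes (item).
        System.out.printf("pseudo / item run time ratio: %.1f%n",
                ratio(14 * 60, 38));

        // Hypothetical numbers, NOT measured: 100 minutes to train,
        // 60 minutes of recommendation work in total, 12 workers.
        System.out.printf("wall clock: %.1f min, machine time: %.1f min%n",
                wallClockMinutes(100, 60, 12),
                machineMinutes(100, 60, 12));
    }
}
```

In this model, adding machines only shrinks the recommendation term; the
training term stays on the wall clock no matter how many workers there are,
which would explain why 12 machines gave so little benefit.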

So my question is: do we have a distributed SVDRecommender implementation? If
not, how should I go about writing one? Can I use the new LanczosSolver to
achieve this?

