It takes 14 hours to run the *pseudo*.RecommenderJob with the SVDRecommender. I ran the following command:

hadoop jar recommender.jar org.apache.mahout.cf.taste.hadoop.pseudo.RecommenderJob -Dmapred.input.dir=testdata/ratings.csv -Dmapred.output.dir=outputBR --recommenderClassName org.apache.mahout.cf.taste.example.bucky.BuckyRecommender

Here BuckyRecommender is SVDRecommender(30,50).

It takes 38 minutes if I run the *item*.RecommenderJob with the following command:

hadoop jar recommender.jar org.apache.mahout.cf.taste.hadoop.item.RecommenderJob -Dmapred.input.dir=testdata/ratings.csv -Dmapred.output.dir=output

item.RecommenderJob is very different from pseudo.RecommenderJob in terms of its distributed implementation, hence the difference in timings, I guess.

On Fri, Nov 19, 2010 at 4:04 PM, Sean Owen wrote:

> That result sounds confusing. It should take about the same number of
> wall-clock hours either way. I don't see why it would take 14 hours -- that
> sounds wrong. If anything it should take 38 / N minutes, where N is the
> number of recommenders you ran.
>
> SVDRecommender is not distributed at all, no.
>
> On Fri, Nov 19, 2010 at 9:34 PM, Sanjib Kumar Das wrote:
>
> > Hi All,
> >
> > I wanted to run a distributed RecommenderJob with the SVDRecommender
> > implementation.
> > So I ran the pseudo.RecommenderJob with an
> > SVDRecommender(numFeatures=30,trainingSteps=50) on the 1M MovieLens
> > data (6040 users). This generated 10 recommendations for each of the
> > 6040 users but took 14 hours to do so! My Hadoop cluster had 12
> > machines. So I guess it just ran multiple instances of the
> > non-distributed SVD implementation, and each of these instances did the
> > same thing again and again. So unless the implementation of the
> > recommender is distributed, we don't get any special benefit from the
> > pseudo.RecommenderJob.
> >
> > But the item.RecommenderJob does the same 10 recommendations each for
> > the 6040 users in 38 minutes. This is because it has an underlying
> > distributed implementation.
> >
> > So my question is: do we have a distributed SVDRecommender
> > implementation? If not, how should I go about writing one? Can I use
> > the new LanczosSolver to achieve this?
> >
> > Thanks,
> > Sanjib
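
To make the guess in the original message concrete, here is a rough back-of-the-envelope sketch in Java. It assumes that every worker in the pseudo-distributed job trains the complete non-distributed model itself and then recommends only for its own share of users. That is the behaviour guessed at in this thread, not something confirmed from the Mahout source. The class and method names are hypothetical, and the training time and per-user time are made-up placeholders, not measurements. Only the 6040 users and 12 machines come from the thread.

```java
// Hypothetical cost model; not part of Mahout.
final class PseudoJobCostModel {

    // Estimated wall-clock minutes when each worker trains the full model
    // on its own and then recommends for roughly users / workers users.
    static double pseudoWallClockMinutes(double trainMinutes, double perUserMinutes,
                                         int users, int workers) {
        int usersPerWorker = (users + workers - 1) / workers; // ceiling division
        return trainMinutes + usersPerWorker * perUserMinutes;
    }

    // Total machine-minutes consumed across the cluster: the training cost
    // is paid once per worker, the recommendation cost once per user.
    static double pseudoTotalMachineMinutes(double trainMinutes, double perUserMinutes,
                                            int users, int workers) {
        return workers * trainMinutes + users * perUserMinutes;
    }

    public static void main(String[] args) {
        int users = 6040;             // from the thread
        int workers = 12;             // from the thread
        double trainMinutes = 60.0;   // ASSUMPTION: placeholder, not measured
        double perUserMinutes = 0.01; // ASSUMPTION: placeholder, not measured

        System.out.printf("wall-clock estimate: %.2f min%n",
                pseudoWallClockMinutes(trainMinutes, perUserMinutes, users, workers));
        System.out.printf("total machine time:  %.2f min%n",
                pseudoTotalMachineMinutes(trainMinutes, perUserMinutes, users, workers));
    }
}
```

Under this model, adding workers only shrinks the per-user term. The training term stays constant in wall-clock time and grows linearly in total machine time, which would fit the observation that the pseudo job gives no special benefit unless the recommender itself is distributed.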