mahout-user mailing list archives

From: Sebastian Schelter <...@apache.org>
Subject: Re: About Matrix Factorization and Vector/Matrix Manipulation
Date: Thu, 10 May 2012 15:45:38 GMT
On 10.05.2012 17:33, 冯伟 wrote:
> I want to look at the distributed implementation of matrix factorization
> in the Mahout recommender system. Before I start from
> org.apache.mahout.cf.taste.hadoop.als.RecommenderJob, are there any papers or
> technical materials for reference? It seems that the parameters are learned
> by ALS. Is there also a stochastic gradient descent implementation? I have
> known about CMU's GraphLab for quite a while, since KDD Cup 2011. Is there
> any comparison between GraphLab's collaborative filtering library and Mahout's?

Mahout's implementation is based on the following papers:

Large-scale Parallel Collaborative Filtering for the Netflix Prize
http://www.hpl.hp.com/personal/Robert_Schreiber/papers/2008%20AAIM%20Netflix/netflix_aaim08(submitted).pdf

Collaborative Filtering for Implicit Feedback Datasets
http://research.yahoo.com/pub/2433
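
For orientation, the first paper's ALS-WR formulation alternates between
solving a regularized least-squares problem for the user factors (with the
item factors fixed) and one for the item factors (with the user factors
fixed). A rough sketch of the objective, in my own notation:

  \min_{U,M} \sum_{(i,j) \in I} \left( r_{ij} - u_i^{\top} m_j \right)^2
    + \lambda \left( \sum_i n_{u_i} \lVert u_i \rVert^2
                   + \sum_j n_{m_j} \lVert m_j \rVert^2 \right)

where I is the set of observed ratings and n_{u_i}, n_{m_j} count the
ratings of user i and item j. With M fixed, each u_i is the solution of an
independent ridge-regression system, which is what makes the per-user (and
per-item) solves embarrassingly parallel.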

There is a comparison in the original GraphLab paper, which is a little
biased IMHO because it uses an early, hacky version of Mahout's ALS
implementation and the experiment is run on a really small dataset.

I still think that Mahout's implementation will be something like 20x
slower than GraphLab, mainly because Hadoop cannot run iterative
computations efficiently.
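
To illustrate what I mean: each ALS half-iteration has to be submitted as
its own MapReduce job, so the ratings and the current factor matrix are
re-read from HDFS and the new factors are written back out on every pass.
A hypothetical driver sketch (not Mahout's actual code), just to show the
pattern:

  // hypothetical sketch, not Mahout's actual driver code
  public class IterativeDriverSketch {

    public static void main(String[] args) {
      int numIterations = 10;
      for (int iteration = 0; iteration < numIterations; iteration++) {
        // solve for user factors, holding the previous item factors fixed
        submitJob("ratings", "itemFactors-" + iteration,
            "userFactors-" + (iteration + 1));
        // solve for item factors, holding the freshly written user factors fixed
        submitJob("ratingsTransposed", "userFactors-" + (iteration + 1),
            "itemFactors-" + (iteration + 1));
      }
    }

    // stand-in for configuring and running one MapReduce job; a real driver
    // would block here while the job reads its inputs from HDFS, runs its
    // map and reduce phases, and writes its output back to HDFS
    private static void submitJob(String ratings, String fixedFactors, String output) {
      System.out.printf("job: read %s and %s from HDFS, write %s%n",
          ratings, fixedFactors, output);
    }
  }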

Mahout has only a non-distributed SGD implementation of matrix
factorization.
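
For anyone curious what that non-distributed SGD variant boils down to,
here is a toy sketch of the standard per-rating update (my own code, not
Mahout's factorizer): compute the prediction error for one observed rating
and move the corresponding user and item factor vectors against the
regularized gradient.

  // toy sketch of SGD matrix factorization, not Mahout's implementation
  public class SgdFactorizationSketch {

    public static void factorize(double[][] userFactors, double[][] itemFactors,
                                 int[][] userItemPairs, double[] ratings,
                                 double learningRate, double lambda, int numEpochs) {
      int numFeatures = userFactors[0].length;
      for (int epoch = 0; epoch < numEpochs; epoch++) {
        for (int n = 0; n < ratings.length; n++) {
          int u = userItemPairs[n][0];
          int i = userItemPairs[n][1];
          // prediction error for this (user, item) observation
          double error = ratings[n] - dot(userFactors[u], itemFactors[i]);
          for (int k = 0; k < numFeatures; k++) {
            double uk = userFactors[u][k];
            double ik = itemFactors[i][k];
            userFactors[u][k] = uk + learningRate * (error * ik - lambda * uk);
            itemFactors[i][k] = ik + learningRate * (error * uk - lambda * ik);
          }
        }
      }
    }

    private static double dot(double[] a, double[] b) {
      double sum = 0;
      for (int k = 0; k < a.length; k++) {
        sum += a[k] * b[k];
      }
      return sum;
    }
  }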

--sebastian
