spark-dev mailing list archives

From Debasish Das
Subject matrix factorization cross validation
Date Wed, 29 Oct 2014 18:23:57 GMT

In the current factorization flow, we cross validate on the test dataset
using the RMSE number but there are some other measures which are worth
looking into.

If we consider the problem as a classification problem and the ratings 1-5
are treated as 5 classes, it is possible to generate a confusion matrix
using MulticlassMetrics.scala
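
As a rough sketch of the idea, without Spark: round each predicted rating to
the nearest class in 1..5 and count (actual, predicted) pairs. The object and
method names here are hypothetical, not MLlib API, and the clamping of
out-of-range predictions is an assumption.

```scala
// Hypothetical helper, not part of MLlib: builds a 5x5 confusion matrix
// from (predicted rating, actual rating) pairs.
object RatingConfusion {
  val NumClasses = 5

  // Round a continuous value to the nearest rating and clamp it to 1..5
  // (assumption: out-of-range predictions fall into the closest class).
  def toClass(value: Double): Int =
    math.min(NumClasses, math.max(1, math.round(value).toInt))

  // Rows are actual ratings, columns are predicted ratings (both 0-based).
  def confusionMatrix(pairs: Seq[(Double, Double)]): Array[Array[Int]] = {
    val m = Array.fill(NumClasses, NumClasses)(0)
    for ((predicted, actual) <- pairs) {
      m(toClass(actual) - 1)(toClass(predicted) - 1) += 1
    }
    m
  }
}
```

With predictions from ALS the pairs would come from joining the predicted and
held-out ratings on (user, product), the same join used for RMSE.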

If the ratings are only 0/1 (like in the Spotify demo from Spark Summit)
then it is possible to use BinaryClassificationMetrics to come up with
the ROC curve...
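
A minimal sketch of what the ROC computation amounts to for 0/1 ratings,
written in plain Scala rather than against MLlib. The names RocSketch,
rocPoints and areaUnderCurve are hypothetical, and it assumes the predicted
rating is used directly as the score.

```scala
// Hypothetical helper, not part of MLlib.
object RocSketch {
  // Input: (predicted score, label) pairs, label 1.0 = positive, 0.0 = negative.
  // Output: (falsePositiveRate, truePositiveRate) points from (0,0) to (1,1),
  // one per distinct score threshold, highest threshold first.
  def rocPoints(scoreAndLabels: Seq[(Double, Double)]): Seq[(Double, Double)] = {
    val positives = scoreAndLabels.count(_._2 > 0.5).toDouble
    val negatives = scoreAndLabels.size - positives
    require(positives > 0 && negatives > 0, "need both positive and negative labels")

    // Group ties together so each distinct score yields a single point.
    val byScore = scoreAndLabels.groupBy(_._1).toSeq.sortBy(-_._1)

    // Running (false positives, true positives) as the threshold is lowered.
    val counts = byScore.scanLeft((0.0, 0.0)) { case ((fp, tp), (_, group)) =>
      val pos = group.count(_._2 > 0.5)
      (fp + (group.size - pos), tp + pos)
    }
    counts.map { case (fp, tp) => (fp / negatives, tp / positives) }
  }

  // Trapezoidal area under the curve defined by the points above.
  def areaUnderCurve(points: Seq[(Double, Double)]): Double =
    points.sliding(2).collect {
      case Seq((x1, y1), (x2, y2)) => (x2 - x1) * (y1 + y2) / 2.0
    }.sum
}
```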

For topK user/products we should also look into prec@k and pdcg@k as the

Does it make sense to add the multiclass metrics and prec@k, pdcg@k to
examples.MovieLensALS along with RMSE?
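
For concreteness, prec@k for one user could look roughly like the sketch
below. The names are hypothetical, items are assumed to be Int product ids,
and "relevant" is assumed to mean the products in that user's held-out test
set.

```scala
// Hypothetical helper, not part of MLlib.
object RankingSketch {
  // recommended: product ids ranked best first; relevant: held-out products.
  // Returns the fraction of the top k recommendations that are relevant.
  def precisionAtK(recommended: Seq[Int], relevant: Set[Int], k: Int): Double = {
    require(k > 0, "k must be positive")
    val hits = recommended.take(k).count(relevant.contains)
    hits.toDouble / k
  }

  // Average prec@k over users; each entry is (recommended, relevant).
  def meanPrecisionAtK(perUser: Seq[(Seq[Int], Set[Int])], k: Int): Double =
    if (perUser.isEmpty) 0.0
    else perUser.map { case (rec, rel) => precisionAtK(rec, rel, k) }.sum / perUser.size
}
```

Dividing by k even when fewer than k products are recommended is a design
choice here; it penalizes short recommendation lists.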

