spark-commits mailing list archives

From m...@apache.org
Subject git commit: [SPARK-2843][MLLIB] add a section about regularization parameter in ALS
Date Thu, 21 Aug 2014 00:48:07 GMT
Repository: spark
Updated Branches:
  refs/heads/branch-1.1 1af68caf6 -> eba399b3c


[SPARK-2843][MLLIB] add a section about regularization parameter in ALS

atalwalkar srowen

Author: Xiangrui Meng <meng@databricks.com>

Closes #2064 from mengxr/als-doc and squashes the following commits:

b2e20ab [Xiangrui Meng] introduced -> discussed
98abdd7 [Xiangrui Meng] add reference
339bd08 [Xiangrui Meng] add a section about regularization parameter in ALS

(cherry picked from commit e0f946265b9ea5bc48849cf7794c2c03d5e29fba)
Signed-off-by: Xiangrui Meng <meng@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/eba399b3
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/eba399b3
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/eba399b3

Branch: refs/heads/branch-1.1
Commit: eba399b3c6768f5106cbc17752630fa81d9cdce4
Parents: 1af68ca
Author: Xiangrui Meng <meng@databricks.com>
Authored: Wed Aug 20 17:47:39 2014 -0700
Committer: Xiangrui Meng <meng@databricks.com>
Committed: Wed Aug 20 17:47:58 2014 -0700

----------------------------------------------------------------------
 docs/mllib-collaborative-filtering.md | 11 +++++++++++
 1 file changed, 11 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/eba399b3/docs/mllib-collaborative-filtering.md
----------------------------------------------------------------------
diff --git a/docs/mllib-collaborative-filtering.md b/docs/mllib-collaborative-filtering.md
index ab10b2f..d5c539d 100644
--- a/docs/mllib-collaborative-filtering.md
+++ b/docs/mllib-collaborative-filtering.md
@@ -43,6 +43,17 @@ level of confidence in observed user preferences, rather than explicit ratings g
 model then tries to find latent factors that can be used to predict the expected preference of a
 user for an item.
 
+### Scaling of the regularization parameter
+
+Since v1.1, we scale the regularization parameter `lambda` in solving each least squares problem by
+the number of ratings the user generated in updating user factors,
+or the number of ratings the product received in updating product factors.
+This approach is named "ALS-WR" and discussed in the paper
+"[Large-Scale Parallel Collaborative Filtering for the Netflix Prize](http://dx.doi.org/10.1007/978-3-540-68880-8_32)".
+It makes `lambda` less dependent on the scale of the dataset.
+So we can apply the best parameter learned from a sampled subset to the full dataset
+and expect similar performance.
+
 ## Examples
 
 <div class="codetabs">
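
The scaling described in the added section can be illustrated with a small sketch. This is not Spark's implementation and is not part of the commit; it is a plain-Python toy that solves a single user's regularized least squares problem, assuming the ALS-WR form in which `lambda` is multiplied by the number of ratings that user generated. The names `solve_linear`, `update_user_factor`, and `scale_by_count` are hypothetical and chosen only for this example.

```python
# Illustrative sketch only: one user-factor update in ALS, with and without
# the ALS-WR scaling of lambda. All function and parameter names are hypothetical.

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not mutated.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def update_user_factor(item_factors, ratings, lam, scale_by_count=True):
    """Solve (sum_j y_j y_j^T + reg * I) x = sum_j r_j y_j for one user.

    item_factors: factor vectors of the items this user rated.
    ratings:      the matching ratings.
    With scale_by_count=True, reg = lam * (number of ratings), as in ALS-WR;
    otherwise reg = lam.
    """
    k = len(item_factors[0])
    reg = lam * len(ratings) if scale_by_count else lam
    a = [[sum(y[i] * y[j] for y in item_factors) + (reg if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(r * y[i] for y, r in zip(item_factors, ratings)) for i in range(k)]
    return solve_linear(a, b)


if __name__ == "__main__":
    items, ratings = [[1.0], [1.0]], [2.0, 4.0]
    # Same user, but with every rating observed twice (a "larger dataset").
    items2, ratings2 = items * 2, ratings * 2
    for scaled in (True, False):
        small = update_user_factor(items, ratings, 0.5, scale_by_count=scaled)
        large = update_user_factor(items2, ratings2, 0.5, scale_by_count=scaled)
        print("scaled" if scaled else "unscaled", small, large)
```

In this toy case, duplicating the user's ratings leaves the scaled solution unchanged, while the unscaled solution shifts because the fixed penalty shrinks relative to the data term. That is the sense in which a `lambda` tuned on a sampled subset may plausibly carry over to the full dataset.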



