spark-issues mailing list archives

From "Xiangrui Meng (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-2085) Apply user-specific regularization instead of uniform regularization in Alternating Least Squares (ALS)
Date Fri, 13 Jun 2014 00:39:01 GMT

     [ https://issues.apache.org/jira/browse/SPARK-2085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng resolved SPARK-2085.
----------------------------------

          Resolution: Implemented
       Fix Version/s: 1.1.0
    Target Version/s: 1.1.0

PR: https://github.com/apache/spark/pull/1026

> Apply user-specific regularization instead of uniform regularization in Alternating Least Squares (ALS)
> -------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-2085
>                 URL: https://issues.apache.org/jira/browse/SPARK-2085
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.0.0
>            Reporter: Shuo Xiang
>            Priority: Minor
>             Fix For: 1.1.0
>
>
> The current implementation of ALS takes a single regularization parameter and applies it
> to both the user factors and the product factors. This kind of regularization can be less
> effective when the number of users is significantly larger than the number of products (or vice
> versa). For example, with 10M users and 1K products, regularization on the user factors will
> dominate. Following the discussion in [this thread](http://apache-spark-user-list.1001560.n3.nabble.com/possible-bug-in-Spark-s-ALS-implementation-tt2567.html#a2704),
> the implementation in this PR regularizes each factor vector by #ratings * lambda, where
> #ratings is the number of ratings associated with that user or product.
> Link to PR: https://github.com/apache/spark/pull/1026
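
For illustration only, the "#ratings * lambda" scaling described above can be sketched in a few lines of plain Python. This is not the code from the PR: the function names (scaled_regularization, rank1_user_update), the sample ratings, and the rank-1 simplification of the least-squares update are all assumptions made to keep the example small.

```python
from collections import Counter


def scaled_regularization(ratings, lam):
    """Return per-user and per-product regularization weights.

    Each weight is (#ratings for that user or product) * lam, in place of
    one uniform lam shared by every factor vector. Hypothetical helper.
    """
    user_counts = Counter(user for user, _, _ in ratings)
    product_counts = Counter(product for _, product, _ in ratings)
    user_reg = {u: n * lam for u, n in user_counts.items()}
    product_reg = {p: n * lam for p, n in product_counts.items()}
    return user_reg, product_reg


def rank1_user_update(user, ratings, product_factor, reg):
    """Closed-form regularized least-squares update for one user.

    Assumes rank-1 factors (one scalar per user/product) so that the
    normal equations reduce to a single division.
    """
    num = sum(product_factor[p] * r for u, p, r in ratings if u == user)
    den = sum(product_factor[p] ** 2 for u, p, _ in ratings if u == user) + reg
    return num / den


# (user, product, rating) triples; made-up sample data.
ratings = [("u1", "p1", 4.0), ("u1", "p2", 2.0), ("u2", "p1", 5.0)]
user_reg, product_reg = scaled_regularization(ratings, lam=0.5)

# u1 has two ratings, so its factor is regularized twice as strongly as u2's.
u1_factor = rank1_user_update("u1", ratings, {"p1": 1.0, "p2": 1.0}, user_reg["u1"])
```

With this scaling, a user or product with many ratings is penalized in proportion to the amount of data it contributes to the loss, so the balance between the two sides no longer depends on how many users and products there are.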



--
This message was sent by Atlassian JIRA
(v6.2#6252)
