mahout-dev mailing list archives

From "Dmitriy Lyubimov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAHOUT-1365) Weighted ALS-WR iterator for Spark
Date Thu, 20 Feb 2014 08:09:19 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906741#comment-13906741 ]

Dmitriy Lyubimov commented on MAHOUT-1365:
------------------------------------------

Yeah. I am not sure what they are doing there. Last time I looked at it, MLlib did not have
any form of weighted ALS. Now this example seems to include "trainImplicit", which works on
the rating matrix only. In the original formulation of the implicit feedback problem there were
two values, a preference and a confidence in that preference. So I am not sure what they do
there, since the input is obviously one sparse matrix.
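
As a reminder, the original Hu-Koren-Volinsky paper derives those two values from a single raw
observation. A minimal Scala sketch of that formulation (my reading of the paper, not MLlib's or
Mahout's actual code; alpha is their tunable scaling constant) would be:

    object HkvConfidence {
      // alpha: tunable scaling constant from the Hu-Koren-Volinsky paper
      val alpha = 40.0
      // binary preference: did the user interact with the item at all?
      def preference(r: Double): Double = if (r > 0) 1.0 else 0.0
      // confidence grows with the strength of the raw observation r
      def confidence(r: Double): Double = 1.0 + alpha * r
    }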

My generalization of the problem includes a formulation where any confidence level could be
attached to either 0 or 1 as a preference, plus a baseline. I also assume the model may have
more than one parameter forming the confidence, which requires fitting as well (simply speaking,
what is the "level of consumption" if a user merely clicks on an item vs. adds it to the cart,
etc.). Similarly, there could be different levels of confidence in ignoring stuff depending on
the situation, so 0 preferences do not always have to carry the baseline confidence either.
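
Just to make that concrete, a toy confidence model of the kind I mean might look like this
(the event types and weights are made up for illustration; they are exactly the parameters
that would need fitting, not anything that exists in the patch):

    // Hypothetical two-event model: clicks and add-to-cart actions.
    case class ConfidenceModel(wClick: Double, wCart: Double, c0: Double) {
      // preference is still binary: any observed activity counts as 1
      def preference(clicks: Int, addToCart: Int): Double =
        if (clicks == 0 && addToCart == 0) 0.0 else 1.0
      // confidence is event-weighted for observed cells; unobserved ("ignored")
      // cells get the baseline c0, which itself could vary by situation
      def confidence(clicks: Int, addToCart: Int): Double =
        if (clicks == 0 && addToCart == 0) c0
        else 1.0 + wClick * clicks + wCart * addToCart
    }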

> Weighted ALS-WR iterator for Spark
> ----------------------------------
>
>                 Key: MAHOUT-1365
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1365
>             Project: Mahout
>          Issue Type: Task
>            Reporter: Dmitriy Lyubimov
>            Assignee: Dmitriy Lyubimov
>             Fix For: 1.0
>
>         Attachments: distributed-als-with-confidence.pdf
>
>
> Given preference P and confidence C distributed sparse matrices, compute the ALS-WR solution
> for implicit feedback (Spark Bagel version).
> Following the Hu-Koren-Volinsky method (stripping off any concrete methodology to build the C
> matrix), with a parameterized test for convergence.
> The computational scheme follows the ALS-WR method (which should be slightly more efficient
> for sparser inputs).
> The best performance will be achieved if non-sparse anomalies are prefiltered out (such as an
> anomalously active user, which doesn't represent a typical user anyway).
> The work is going on here: https://github.com/dlyubimov/mahout-commits/tree/dev-0.9.x-scala.
> I am porting away our (A1) implementation, so there are a few issues associated with that.
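
For context, the per-user solve in the Hu-Koren-Volinsky formulation (which the ALS-WR-style
sweep above amounts to) is

    x_u = (Y' C_u Y + lambda * I)^-1  Y' C_u p_u

where Y is the item factor matrix, C_u the diagonal confidence matrix of user u, and p_u the
preference vector. The efficiency on sparser inputs mentioned above comes from rewriting
Y' C_u Y as Y'Y + Y' (C_u - I) Y: Y'Y is computed once per sweep, and only the items the user
actually touched contribute to the correction term.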



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
