[ https://issues.apache.org/jira/browse/SPARK-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Debasish Das updated SPARK-6323:

Description:
Currently, ml.recommendation.ALS is optimized for Gram matrix generation, which only scales
to modest ranks. The problems that it can solve are in the normal-equation/quadratic form:
0.5 x'Hx + c'x + g(z)
where g(z) can be any one of the constraints from the Breeze proximal library:
https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/proximal/Proximal.scala
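
To make the quadratic form concrete, here is a minimal sketch in plain Python of minimizing 0.5 x'Hx + c'x + g(x) by proximal gradient. This is an illustration only: it is not the Breeze code, it uses proximal gradient as a simple stand-in for the ADMM and SPG solvers mentioned in this issue, and every function name in it is hypothetical. The two choices of g shown (an L1 penalty and a nonnegativity constraint) are assumed examples of the kind of constraint meant by g(z).

```python
import math

def prox_l1(v, t):
    """Soft-thresholding: proximal operator of t * ||x||_1."""
    return [math.copysign(max(abs(vi) - t, 0.0), vi) for vi in v]

def project_nonnegative(v, t):
    """Projection onto x >= 0 (prox of the indicator function; t is unused)."""
    return [max(vi, 0.0) for vi in v]

def matvec(H, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(hij * xj for hij, xj in zip(row, x)) for row in H]

def proximal_gradient(H, c, prox, step, iters=500):
    """Minimize 0.5 x'Hx + c'x + g(x), where prox(v, t) is the prox of t * g.

    The step size should be at most 1 / (largest eigenvalue of H).
    """
    x = [0.0] * len(c)
    for _ in range(iters):
        # Gradient of the smooth quadratic part: Hx + c
        grad = [hx + ci for hx, ci in zip(matvec(H, x), c)]
        # Gradient step followed by the proximal step for g
        x = prox([xi - step * gi for xi, gi in zip(x, grad)], step)
    return x

# Hypothetical 2x2 problem; the unconstrained minimizer is [1, -1].
H = [[2.0, 0.0], [0.0, 2.0]]
c = [-2.0, 2.0]
x_nonneg = proximal_gradient(H, c, project_nonnegative, step=0.25)
x_l1 = proximal_gradient(H, c, lambda v, t: prox_l1(v, 1.0 * t), step=0.25)
```

With the nonnegativity constraint the second coordinate is clamped to zero, and with the L1 penalty (weight 1.0) both coordinates shrink toward zero. Swapping g only means swapping the prox function, which is the property the proposal relies on.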
In this PR we will reuse the ml.recommendation.ALS design and come up with ml.recommendation.ALM
(Alternating Minimization). Thanks to [~mengxr]'s recent changes, this is now straightforward
to do.
ALM will be capable of solving problems of the following form: min f(x) + g(z)
1. The loss function f(x) can be LeastSquareLoss, LoglikelihoodLoss, or HingeLoss. Most likely
we will reuse the Gradient interfaces already defined and implement LoglikelihoodLoss.
2. The supported constraints g(z) are the same as above, except that we don't yet support
affine + bounds constraints (Aeq x = beq, lb <= x <= ub). Most likely we don't need those for
ML applications.
3. For the solver we will use breeze.optimize.proximal.NonlinearMinimizer, which in turn uses a
projection-based solver (SPG) or proximal solvers (ADMM), chosen based on convergence speed.
https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/optimize/proximal/NonlinearMinimizer.scala
4. The factors will be SparseVector so that we keep shuffle size in check. For example, we
will run with 10K ranks but force the factors to be 100-sparse.
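
As an illustration of item 4, one way to force a factor to be k-sparse is to keep only its k largest-magnitude entries. The sketch below is plain Python and makes that assumption explicitly: this issue does not say how the sparsity will be enforced, and the function names and the index-to-value map used as the sparse representation are hypothetical, not Spark's SparseVector.

```python
def project_top_k(dense, k):
    """Keep the k largest-magnitude entries; return a sparse {index: value} map."""
    if k <= 0:
        return {}
    # Indices ordered by decreasing absolute value
    ranked = sorted(range(len(dense)), key=lambda i: abs(dense[i]), reverse=True)
    # Store survivors in index order and drop explicit zeros
    return {i: dense[i] for i in sorted(ranked[:k]) if dense[i] != 0.0}

def sparse_dot(a, b):
    """Dot product of two sparse maps, iterating over the smaller one."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b[i] for i, v in a.items() if i in b)

# Hypothetical rank-5 factor reduced to 2 nonzeros
user_factor = project_top_k([0.1, -3.0, 0.0, 2.0, 0.5], k=2)
item_factor = {3: 4.0, 4: 1.0}
score = sparse_dot(user_factor, item_factor)
```

With rank 10K and k = 100, each factor would carry at most 100 index/value pairs instead of 10,000 doubles, which is where the shuffle saving would come from.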
This is closely related to Sparse LDA https://issues.apache.org/jira/browse/SPARK-5564 with
the difference that we are not using graph representation here.
As we run scaling experiments, we will learn which flow is better suited as the ratings get
denser. My understanding is that, since we have already scaled ALS to 2 billion ratings and
will keep sparsity in check, the same 2-billion-rating flow will scale to 10K ranks as well.
This JIRA is intended to extend the capabilities of ml recommendation to generalized loss
functions.
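
For reference, the three losses named above can be sketched as functions of a single prediction, each returning the loss value and its derivative with respect to the prediction. This is a plain-Python illustration, not the Spark Gradient interface: the function names, the label conventions ({0, 1} for the log-likelihood loss, {-1, +1} for hinge), and the assumption that the prediction is the dot product of a user factor and an item factor are all mine.

```python
import math

def least_squares(pred, label):
    """Returns (loss, dloss/dpred) for 0.5 * (pred - label)^2."""
    diff = pred - label
    return 0.5 * diff * diff, diff

def log_likelihood(pred, label):
    """Negative Bernoulli log-likelihood; label in {0, 1}, pred is a raw score."""
    # log(1 + exp(pred)) computed without overflow
    softplus = max(pred, 0.0) + math.log1p(math.exp(-abs(pred)))
    # Numerically stable sigmoid
    if pred >= 0:
        sigmoid = 1.0 / (1.0 + math.exp(-pred))
    else:
        e = math.exp(pred)
        sigmoid = e / (1.0 + e)
    return softplus - label * pred, sigmoid - label

def hinge(pred, label):
    """Hinge loss; label in {-1, +1}. Returns a subgradient, since the loss has a kink."""
    margin = 1.0 - label * pred
    if margin > 0:
        return margin, -float(label)
    return 0.0, 0.0

# Hypothetical prediction for one (user, item) pair
loss, grad = log_likelihood(0.0, 1)
```

Only the least-squares loss leads to the normal-equation form that ALS solves directly; the other two have no closed-form update, which is why an iterative solver is needed for them.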
> Large rank matrix factorization with Nonlinear loss and constraints
> 
>
> Key: SPARK-6323
> URL: https://issues.apache.org/jira/browse/SPARK-6323
> Project: Spark
> Issue Type: New Feature
> Components: ML, MLlib
> Affects Versions: 1.4.0
> Reporter: Debasish Das
> Fix For: 1.4.0
>
> Original Estimate: 672h
> Remaining Estimate: 672h
>

