systemml-issues mailing list archives

From "Mike Dusenberry (JIRA)" <>
Subject [jira] [Commented] (SYSTEMML-1437) Implement and scale Factorization Machines using SystemML
Date Mon, 17 Jul 2017 19:54:00 GMT


Mike Dusenberry commented on SYSTEMML-1437:

[~return_01] Thanks for working on this!  It will be awesome to have FM models in SystemML!
I've copied the message below from GitHub:

Since FM models can be used for several types of ML tasks, as seen in section III.B, I would
suggest we integrate the FM model in a modular way.  I'm a bit biased, but I think the modular
framework used for neural nets is quite powerful, and I think we could make use of it here.
In that framework, each module exposes a `forward/backward/init` API, and modules
can be mixed together.  An FM model exposing this API could then be combined with an L2 loss
layer from the `nn` library for regression problems, or with `sigmoid` and `log_loss` layers
for binary classification problems.  In short, we would have one FM module, plus separate
algorithm files for training a regression FM, a binary-classification FM, etc.
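
As a rough illustration of why this composition works, here is a minimal sketch in plain Python (not DML) of the two "heads" mentioned above. Both take the raw model outputs and return a loss plus an upstream gradient of the same shape, which is all a module's `backward` function needs. The function names mirror the layer names, but the bodies, the averaging over examples, and the helper names `regression_head` and `binary_head` are assumptions for illustration, not code from the `nn` library.

```python
import math

def l2_loss_forward(pred, y):
    # Mean over examples of 0.5 * (pred - y)^2 (averaging is an assumption).
    return sum(0.5 * (p - t) ** 2 for p, t in zip(pred, y)) / len(y)

def l2_loss_backward(pred, y):
    # Gradient of the mean L2 loss w.r.t. each prediction.
    n = len(y)
    return [(p - t) / n for p, t in zip(pred, y)]

def sigmoid_forward(z):
    # Numerically stable logistic function, applied element-wise.
    out = []
    for v in z:
        if v >= 0:
            out.append(1.0 / (1.0 + math.exp(-v)))
        else:
            e = math.exp(v)
            out.append(e / (1.0 + e))
    return out

def sigmoid_backward(dout, z):
    # Chain rule through the sigmoid: dout * s * (1 - s).
    return [g * s * (1.0 - s) for g, s in zip(dout, sigmoid_forward(z))]

def log_loss_forward(prob, y):
    # Mean binary cross-entropy; assumes 0 < prob < 1 and y in {0, 1}.
    n = len(y)
    return -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                for p, t in zip(prob, y)) / n

def log_loss_backward(prob, y):
    # Gradient of the mean log loss w.r.t. each probability.
    n = len(y)
    return [(p - t) / (p * (1.0 - p)) / n for p, t in zip(prob, y)]

def regression_head(scores, targets):
    # Hypothetical helper: model output -> L2 loss, plus dloss/dscores.
    return l2_loss_forward(scores, targets), l2_loss_backward(scores, targets)

def binary_head(scores, targets):
    # Hypothetical helper: model output -> sigmoid -> log loss, plus dloss/dscores.
    prob = sigmoid_forward(scores)
    dprob = log_loss_backward(prob, targets)
    return log_loss_forward(prob, targets), sigmoid_backward(dprob, scores)
```

Either head returns a gradient with one entry per example, so the same FM `backward` function could sit underneath both without modification.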

If we go that route, I would suggest having a core `FM` module (i.e., `fm.dml`) that has the
`forward/backward/init` API that the neural net layers have.  This module would be for the
core FM model, i.e., the equations in section III.A, as you've been implementing.  The `forward`
function would accept the input data `X`, and the parameters `w0`, `W`, and `V`, and would
return the FM outputs `y`, which is a vector containing a single output for each example.
 The `backward` function would accept the upstream gradients w.r.t. `y` (i.e., `dloss/dy`,
the gradient of the loss function w.r.t. `y`), which is the same shape as `y`, as well as
`X`, `w0`, `W`, and `V`, and would return the gradients of the loss w.r.t. the parameters,
i.e., `dw0`, `dW`, and `dV`.  The `init` function would accept the number of features `n`
(or `d`, if we switch to the more common notation; note that this `d` would not be the same
as the "degree" from the paper) and the factorization dimensionality `k`, and would return
initialized `w0`, `W`, and `V` values.  Thus, the core FM model would be modularized.
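
To make the proposed API concrete, here is a minimal sketch of such a module in plain Python rather than DML, using nested lists instead of matrices. It assumes the standard second-order FM equation (bias, linear term, and factorized pairwise interactions); the zero/small-Gaussian initialization and the `seed` argument are assumptions for illustration, and a real `fm.dml` would use vectorized matrix operations instead of loops.

```python
import random

def init(d, k, seed=0):
    # Assumed initialization: zero bias and linear weights, small random factors.
    rng = random.Random(seed)
    w0 = 0.0
    W = [0.0] * d
    V = [[rng.gauss(0.0, 0.01) for _ in range(k)] for _ in range(d)]
    return w0, W, V

def forward(X, w0, W, V):
    # For each row x of X:
    #   y = w0 + sum_i W[i]*x[i]
    #       + 0.5 * sum_f ((sum_i V[i][f]*x[i])^2 - sum_i (V[i][f]*x[i])^2)
    k = len(V[0])
    y = []
    for x in X:
        linear = sum(wi * xi for wi, xi in zip(W, x))
        pairwise = 0.0
        for f in range(k):
            s = sum(V[i][f] * xi for i, xi in enumerate(x))
            sq = sum((V[i][f] * xi) ** 2 for i, xi in enumerate(x))
            pairwise += 0.5 * (s * s - sq)
        y.append(w0 + linear + pairwise)
    return y

def backward(dy, X, w0, W, V):
    # dy[n] is dloss/dy for example n; returns gradients w.r.t. w0, W, and V.
    d, k = len(V), len(V[0])
    dw0 = sum(dy)
    dW = [0.0] * d
    dV = [[0.0] * k for _ in range(d)]
    for g, x in zip(dy, X):
        # s[f] = sum_j V[j][f] * x[j], shared by every feature i.
        s = [sum(V[i][f] * x[i] for i in range(d)) for f in range(k)]
        for i in range(d):
            dW[i] += g * x[i]
            for f in range(k):
                dV[i][f] += g * (x[i] * s[f] - V[i][f] * x[i] ** 2)
    return dw0, dW, dV
```

A caller would run `y = forward(X, w0, W, V)`, obtain `dy` from whatever loss layer sits on top, and then call `backward(dy, X, w0, W, V)` to get `dw0`, `dW`, and `dV` for the optimizer.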

Given this modularized FM model in `fm.dml`, we could then have separate files for training
specific types of FM models.  For example, a regression FM (maybe in `fm_regression.dml`)
would use the `forward/backward/init` API from `fm.dml` together with
`nn/layers/l2_loss.dml`, and we could then optimize it with the Adam optimizer.  That file
would have a `train` (or `fit`) function, as well as `predict` and `eval` functions, for
training, prediction, and accuracy evaluation (and other metrics).  We could also create
another file for binary classification FM models that uses `nn/layers/sigmoid.dml` and
`nn/layers/log_loss.dml`.
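
The `train/predict/eval` split could look roughly like the following Python sketch (again, not DML). Several things here are assumptions made to keep it short: plain full-batch gradient descent stands in for Adam, the evaluation function is named `evaluate` because `eval` is a Python built-in, and a tiny linear model (`linear_forward`, `linear_backward`) stands in for the FM module so the snippet is self-contained. Any module whose `forward(X, *params)` and `backward(dy, X, *params)` follow the same convention could be passed in instead.

```python
def sgd_update(p, g, lr):
    # Apply p - lr * g to a scalar, a list, or a nested list of parameters.
    if isinstance(p, list):
        return [sgd_update(pi, gi, lr) for pi, gi in zip(p, g)]
    return p - lr * g

def train(forward, backward, params, X, targets, lr=0.1, epochs=100):
    # Regression training: module output -> mean L2 loss -> gradient step.
    n = len(targets)
    for _ in range(epochs):
        y = forward(X, *params)
        # Gradient of mean(0.5 * (y - t)^2) w.r.t. y.
        dy = [(yi - ti) / n for yi, ti in zip(y, targets)]
        grads = backward(dy, X, *params)
        params = tuple(sgd_update(p, g, lr) for p, g in zip(params, grads))
    return params

def predict(forward, params, X):
    return forward(X, *params)

def evaluate(forward, params, X, targets):
    # Mean squared error of the predictions.
    y = predict(forward, params, X)
    return sum((yi - ti) ** 2 for yi, ti in zip(y, targets)) / len(targets)

# Hypothetical stand-in module (NOT an FM): y = w0 + sum_i W[i] * x[i].
def linear_forward(X, w0, W):
    return [w0 + sum(wi * xi for wi, xi in zip(W, x)) for x in X]

def linear_backward(dy, X, w0, W):
    dw0 = sum(dy)
    dW = [sum(g * x[i] for g, x in zip(dy, X)) for i in range(len(W))]
    return dw0, dW
```

For example, `train(linear_forward, linear_backward, (0.0, [0.0]), X, targets)` returns updated parameters that can be handed to `predict` or `evaluate`. A binary-classification file would differ only in the loss head used to compute `dy` and in the metric reported.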

If you like this approach, I would suggest looking at the layers in `nn/layers` for the
`forward/backward/init` API as inspiration for the core FM module, and at the MNIST LeNet
example for the `train/predict/eval` API for the specific types of FM models.

The modularized implementation will allow others to build DML scripts that import the FM models
and train them within a larger DML script that perhaps does preprocessing, etc.  For command
line training of the specific FM models only, you could create separate scripts, similar to
the train and predict LeNet scripts, which just call functions in the main MNIST LeNet file.

> Implement and scale Factorization Machines using SystemML
> ---------------------------------------------------------
>                 Key: SYSTEMML-1437
>                 URL:
>             Project: SystemML
>          Issue Type: Task
>            Reporter: Imran Younus
>              Labels: factorization_machines, gsoc2017, machine_learning, mentor, recommender_system
> Factorization Machines have gained popularity in recent years due to their effectiveness
> in recommendation systems. FMs are general predictors that can capture interactions
> between all features in a feature matrix. The feature matrices pertinent to recommendation
> systems are highly sparse. SystemML's highly efficient distributed sparse matrix operations
> can be leveraged to implement FMs in a scalable fashion. Given the closed model equation of
> FMs, the model parameters can be learned using gradient descent methods.
> This project aims to implement FMs as described in the first paper:
> We'll showcase the scalability of the SystemML implementation of FMs by creating an end-to-end
> recommendation system.
> Basic understanding of machine learning and optimization techniques is required. Will
> need to collaborate with the team to resolve scaling and other systems-related issues.
> Rating: Medium
> Mentors:  [~iyounus], [~nakul02]
