systemml-issues mailing list archives

From "Matthias Boehm (JIRA)" <>
Subject [jira] [Commented] (SYSTEMML-2299) API design of the paramserv function
Date Sun, 13 May 2018 00:41:00 GMT


Matthias Boehm commented on SYSTEMML-2299:

[~Guobao] in general, this is a very good start. I would recommend making the description
a little more explicit though:
* Hyper Parameters: Separate the model (weight and bias matrices) from the hyper parameters.
Some hyper parameters such as the batchsize and architecture (in form of the given fun1) are
already explicit inputs. Maybe we could pass the other hyper parameters such as learning rate,
momentum, regularization (which mostly affect the optimizer and thus, fun2) via a separate
named list?
* Formatting: Please use the code tag to highlight the function signature and individual input
types. You already give examples, but in order to make it explicit, it would be good to define
the types. For example, add the alternatives for mode, freq, and checkpoint.
* Checkpoint: I don't understand what you mean by rollback recovery here. Maybe we should
start simple and use types such as NONE, EPOCH, EPOCH10 to indicate at which frequency we perform
model checkpointing.
* Data Distribution: Another aspect that is currently unspecified is how the data is distributed
to the individual workers. How about adding an additional parameter for that? Example schemes
are disjoint_contiguous (contiguous splits of X and y), disjoint_round_robin (X and y distributed
row-wise in round-robin order), disjoint_random, and overlap_reshuffle (every worker gets all data
but reshuffled in a different random order).
* Optional parameters: Finally, please specify which parameters are optional and which defaults
apply if they are not specified.
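
To make the four data distribution schemes concrete, here is a minimal Python sketch of how they could assign row indices of X and y to k workers. This is not SystemML code: the function names simply mirror the scheme names above, and the signatures, the seed parameter, and the SCHEMES lookup table are assumptions made for illustration.

```python
import random
from typing import Callable, Dict, List

# Illustrative only: these helpers are hypothetical and not part of SystemML.
# Each returns one list of row indices per worker.

def disjoint_contiguous(n_rows: int, k: int) -> List[List[int]]:
    """Split rows 0..n_rows-1 into k contiguous blocks of near-equal size."""
    base, extra = divmod(n_rows, k)
    parts, start = [], 0
    for w in range(k):
        size = base + (1 if w < extra else 0)  # first `extra` workers get one more row
        parts.append(list(range(start, start + size)))
        start += size
    return parts

def disjoint_round_robin(n_rows: int, k: int) -> List[List[int]]:
    """Assign row i to worker i mod k."""
    return [list(range(w, n_rows, k)) for w in range(k)]

def disjoint_random(n_rows: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle all rows once, then deal the permutation out to k workers."""
    perm = list(range(n_rows))
    random.Random(seed).shuffle(perm)
    return [perm[w::k] for w in range(k)]

def overlap_reshuffle(n_rows: int, k: int, seed: int = 0) -> List[List[int]]:
    """Give every worker all rows, each in its own random order."""
    parts = []
    for w in range(k):
        perm = list(range(n_rows))
        random.Random(seed + w).shuffle(perm)  # a different seed per worker
        parts.append(perm)
    return parts

# Hypothetical lookup from a scheme parameter value to its partitioner.
SCHEMES: Dict[str, Callable[..., List[List[int]]]] = {
    "disjoint_contiguous": disjoint_contiguous,
    "disjoint_round_robin": disjoint_round_robin,
    "disjoint_random": disjoint_random,
    "overlap_reshuffle": overlap_reshuffle,
}
```

For example, SCHEMES["disjoint_round_robin"](10, 3) assigns rows 0, 3, 6, 9 to the first worker. The three disjoint schemes partition the rows, so each row is seen by exactly one worker, whereas overlap_reshuffle hands every worker the full data set.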

> API design of the paramserv function
> ------------------------------------
>                 Key: SYSTEMML-2299
>                 URL:
>             Project: SystemML
>          Issue Type: Sub-task
>            Reporter: LI Guobao
>            Assignee: LI Guobao
>            Priority: Major
> The objective of the “paramserv” built-in function is to update an initial or existing
> model according to a given configuration. An initial function signature would be _model'=paramserv(model,
> X, y, X_val, y_val, upd=fun1, mode=SYNC, freq=EPOCH, agg=fun2, epochs=100, batchsize=64, k=7,
> checkpointing=rollback)_. We are interested in providing the model (which will be a struct-like
> data structure consisting of the weights, the biases and the hyperparameters), the training
> features and labels, the validation features and labels, the batch update function (i.e.,
> the gradient calculation function), the update strategy (e.g. sync, async, hogwild!, stale-synchronous),
> the update frequency (e.g. epoch or mini-batch), the gradient aggregation function, the number
> of epochs, the batch size, the degree of parallelism, as well as the checkpointing strategy
> (e.g. rollback recovery). The function will return a trained model in struct format.
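
One way to make the parameter types, alternatives, and defaults explicit, as requested in the comment above, is sketched below in Python. It is not the actual paramserv API: the class and function names are hypothetical, the enum members are taken from the alternatives mentioned in this thread, and the default values simply reuse the numbers from the example call because the thread does not define real defaults.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict

class Mode(Enum):        # update strategy
    SYNC = "SYNC"
    ASYNC = "ASYNC"
    HOGWILD = "HOGWILD"
    SSP = "SSP"          # stale-synchronous

class Freq(Enum):        # update frequency
    EPOCH = "EPOCH"
    BATCH = "BATCH"

class Checkpoint(Enum):  # frequency of model checkpointing
    NONE = "NONE"
    EPOCH = "EPOCH"
    EPOCH10 = "EPOCH10"

@dataclass
class ParamservConfig:
    upd: str                             # required: name of the batch update function
    agg: str                             # required: name of the aggregation function
    mode: Mode = Mode.SYNC               # assumed default
    freq: Freq = Freq.EPOCH              # assumed default
    epochs: int = 100                    # assumed default (from the example call)
    batchsize: int = 64                  # assumed default (from the example call)
    k: int = 7                           # assumed default (from the example call)
    checkpoint: Checkpoint = Checkpoint.NONE
    # Separate named list for learning rate, momentum, regularization, etc.
    hyperparams: Dict[str, float] = field(default_factory=dict)

def _coerce(enum_cls, value):
    """Accept an enum member or its (case-insensitive) name."""
    if isinstance(value, enum_cls):
        return value
    try:
        return enum_cls[str(value).upper()]
    except KeyError:
        allowed = ", ".join(m.name for m in enum_cls)
        raise ValueError(
            f"{value!r} is not a valid {enum_cls.__name__}; expected one of: {allowed}"
        ) from None

def parse_config(upd: str, agg: str, **opts) -> ParamservConfig:
    """Build a config, validating the enumerated options."""
    for name, enum_cls in (("mode", Mode), ("freq", Freq), ("checkpoint", Checkpoint)):
        if name in opts:
            opts[name] = _coerce(enum_cls, opts[name])
    return ParamservConfig(upd=upd, agg=agg, **opts)
```

With this sketch, parse_config("fun1", "fun2", checkpoint="epoch10", hyperparams={"lr": 0.01}) fills in the remaining options from the assumed defaults, while checkpoint="rollback" is rejected with an error listing the allowed values.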

