singa-commits mailing list archives

From wang...@apache.org
Subject svn commit: r1691945 - in /incubator/singa/site/trunk/content/markdown/docs: model-config.md program-model.md
Date Mon, 20 Jul 2015 13:46:50 GMT
Author: wangwei
Date: Mon Jul 20 13:46:49 2015
New Revision: 1691945

URL: http://svn.apache.org/r1691945
Log:
add the programming model page

Modified:
    incubator/singa/site/trunk/content/markdown/docs/model-config.md
    incubator/singa/site/trunk/content/markdown/docs/program-model.md

Modified: incubator/singa/site/trunk/content/markdown/docs/model-config.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/model-config.md?rev=1691945&r1=1691944&r2=1691945&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/model-config.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/model-config.md Mon Jul 20 13:46:49 2015
@@ -14,22 +14,6 @@ has a model.conf file.
 
 ### NeuralNet
 
-#### Deep learning training
-
-Deep learning is labeled as a feature learning technique, which usually
-consists of multiple layers.  Each layer is associated a feature transformation
-function. After going through all layers, the raw input feature (e.g., pixels
-of images) would be converted into a high-level feature that is easier for
-tasks like classification.
-
-Training a deep learning model is to find the optimal parameters involved in
-the transformation functions that generates good features for specific tasks.
-The goodness of a set of parameters is measured by a loss function, e.g.,
-[Cross-Entropy Loss](https://en.wikipedia.org/wiki/Cross_entropy). Since the
-loss functions are usually non-linear and non-convex, it is difficult to get a
-closed form solution. Normally, people uses the SGD algorithm which randomly
-initializes the parameters and then iteratively update them to reduce the loss.
-
 #### Uniform model (neuralnet) representation
 
 <img src = "../images/model-categorization.png" style = "width: 400px"> Fig. 1:

Modified: incubator/singa/site/trunk/content/markdown/docs/program-model.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/program-model.md?rev=1691945&r1=1691944&r2=1691945&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/program-model.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/program-model.md Mon Jul 20 13:46:49 2015
@@ -0,0 +1,101 @@
+## Programming Model
+
+We describe the programming model of SINGA to give users instructions for
+implementing a new model and submitting a training job. The programming model
+is almost transparent to the underlying distributed environment, so users do
+not need to worry much about the communication and synchronization of nodes,
+which is discussed in detail in [architecture](architecture.html).
+
+### Deep learning training
+
+Deep learning is a feature learning technique that usually involves multiple
+layers, where each layer is associated with a feature transformation function.
+After going through all layers, the raw input features (e.g., the pixels of an
+image) are converted into high-level features that are easier to use for tasks
+like classification.
+
+Training a deep learning model means finding the optimal parameters of the
+transformation functions, i.e., parameters that generate good features for the
+task at hand. The goodness of a set of parameters is measured by a loss function,
+e.g., [Cross-Entropy Loss](https://en.wikipedia.org/wiki/Cross_entropy). Since
+loss functions are usually non-linear and non-convex, it is difficult to obtain
+a closed-form solution. Normally, people use the SGD algorithm, which randomly
+initializes the parameters and then iteratively updates them to reduce the loss.
+
+
+### Steps to submit a training job
+
+SINGA uses the stochastic gradient descent (SGD) algorithm to train the
+parameters of deep learning models. In each SGD iteration, a
+[Worker](architecture.html) computes gradients of the parameters over the
+NeuralNet, and an [Updater]() updates the parameter values based on those
+gradients. SINGA implements three algorithms for gradient computation, namely
+back-propagation for feed-forward models, back-propagation through time for
+recurrent neural networks, and contrastive divergence for energy models like
+RBM and DBM. Several SGD updaters are also provided, including
+[AdaDelta](http://arxiv.org/pdf/1212.5701v1.pdf),
+[AdaGrad](http://www.magicbroom.info/Papers/DuchiHaSi10.pdf),
+[RMSProp](http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf), and
+[Nesterov](http://scholar.google.com/citations?view_op=view_citation&amp;hl=en&amp;user=DJ8Ep8YAAAAJ&amp;citation_for_view=DJ8Ep8YAAAAJ:hkOj_22Ku90C).
+
+Consequently, to submit a training job, a user needs to:
+
+  1. [Prepare the data](data.html) for training, validation and test.
+
+  2. [Implement the new Layers](layer.html) to support specific feature transformations
+  required in the new model.
+
+  3. Configure the training job, including the [cluster setting](architecture.html)
+  and the [model configuration](model-config.html).
+
+### Driver program
+
+Each training job has a driver program that
+
+  * registers the layers implemented by the user and,
+
+  * starts the [Trainer](https://github.com/apache/incubator-singa/blob/master/include/trainer/trainer.h)
+  by providing the job configuration.
+
+An example driver program looks like:
+
+    #include "singa.h"
+    #include "user-layer.h"  // header for user defined layers
+
+    DEFINE_int32(job, -1, "Job ID");  // job ID generated by the SINGA script
+    DEFINE_string(workspace, "examples/mnist/", "workspace of the training job");
+    DEFINE_bool(resume, false, "resume from checkpoint");
+
+    int main(int argc, char** argv) {
+      google::InitGoogleLogging(argv[0]);
+      gflags::ParseCommandLineFlags(&argc, &argv, true);
+
+      // register all user defined layers in user-layer.h
+      Register(kFooLayer, FooLayer);
+      ...
+
+      JobProto jobConf;
+      // read job configuration from text conf file
+      ReadProtoFromTextFile(&jobConf, FLAGS_workspace + "/job.conf");
+      Trainer trainer;
+      trainer.Start(FLAGS_job, jobConf, FLAGS_resume);
+    }
+
+Users can also configure the job in the driver program, instead of writing a
+configuration file:
+
+
+      JobProto jobConf;
+      jobConf.set_job_name("my singa job");
+      ... // configure cluster and model
+      Trainer trainer;
+      trainer.Start(FLAGS_job, jobConf, FLAGS_resume);
+
+In the future, we will provide helper functions to make the configuration
+easier, similar to [keras](https://github.com/fchollet/keras).
+
+Compile and link the driver program with the SINGA library to generate an
+executable, e.g., named `mysinga`. To submit the job, simply pass the path of
+the executable and the workspace to the SINGA job submission script:
+
+    ./bin/singa-run.sh <path to mysinga> -workspace=<my job workspace>


