singa-commits mailing list archives

From build...@apache.org
Subject svn commit: r959027 - in /websites/staging/singa/trunk/content: ./ docs/model-config.html docs/program-model.html
Date Mon, 20 Jul 2015 13:47:26 GMT
Author: buildbot
Date: Mon Jul 20 13:47:26 2015
New Revision: 959027

Log:
Staging update by buildbot for singa

Modified:
    websites/staging/singa/trunk/content/   (props changed)
    websites/staging/singa/trunk/content/docs/model-config.html
    websites/staging/singa/trunk/content/docs/program-model.html

Propchange: websites/staging/singa/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Mon Jul 20 13:47:26 2015
@@ -1 +1 @@
-1691875
+1691945

Modified: websites/staging/singa/trunk/content/docs/model-config.html
==============================================================================
--- websites/staging/singa/trunk/content/docs/model-config.html (original)
+++ websites/staging/singa/trunk/content/docs/model-config.html Mon Jul 20 13:47:26 2015
@@ -451,10 +451,6 @@
 <div class="section">
 <h3><a name="NeuralNet"></a>NeuralNet</h3>
 <div class="section">
-<h4><a name="Deep_learning_training"></a>Deep learning training</h4>
-<p>Deep learning is labeled as a feature learning technique, which usually consists
of multiple layers. Each layer is associated a feature transformation function. After going
through all layers, the raw input feature (e.g., pixels of images) would be converted into
a high-level feature that is easier for tasks like classification.</p>
-<p>Training a deep learning model is to find the optimal parameters involved in the
transformation functions that generates good features for specific tasks. The goodness of
a set of parameters is measured by a loss function, e.g., <a class="externalLink" href="https://en.wikipedia.org/wiki/Cross_entropy">Cross-Entropy
Loss</a>. Since the loss functions are usually non-linear and non-convex, it is difficult
to get a closed form solution. Normally, people uses the SGD algorithm which randomly initializes
the parameters and then iteratively update them to reduce the loss.</p></div>
-<div class="section">
 <h4><a name="Uniform_model_neuralnet_representation"></a>Uniform model
(neuralnet) representation</h4>
 <p><img src="../images/model-categorization.png" style="width: 400px" alt="Fig. 1: Deep learning model categorization" />
Fig. 1: Deep learning model categorization</p>
 <p>Many deep learning models have been proposed. Fig. 1 shows a categorization of popular
deep learning models based on their layer connections. The <a class="externalLink" href="https://github.com/apache/incubator-singa/blob/master/include/neuralnet/neuralnet.h">NeuralNet</a>
abstraction of SINGA consists of multiple directly connected layers. This abstraction is able
to represent models from all three categories.</p>

Modified: websites/staging/singa/trunk/content/docs/program-model.html
==============================================================================
--- websites/staging/singa/trunk/content/docs/program-model.html (original)
+++ websites/staging/singa/trunk/content/docs/program-model.html Mon Jul 20 13:47:26 2015
@@ -9,7 +9,7 @@
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
     <meta name="Date-Revision-yyyymmdd" content="20150720" />
     <meta http-equiv="Content-Language" content="en" />
-    <title>Apache SINGA &#x2013; </title>
+    <title>Apache SINGA &#x2013; Programming Model</title>
     <link rel="stylesheet" href="../css/apache-maven-fluido-1.4.min.css" />
     <link rel="stylesheet" href="../css/site.css" />
     <link rel="stylesheet" href="../css/print.css" media="print" />
@@ -195,7 +195,7 @@
         Apache SINGA</a>
                     <span class="divider">/</span>
       </li>
-        <li class="active "></li>
+        <li class="active ">Programming Model</li>
         
                 
                     
@@ -445,7 +445,81 @@
                         
         <div id="bodyColumn"  class="span10" >
                                   
-            
+            <div class="section">
+<h2><a name="Programming_Model"></a>Programming Model</h2>
+<p>We describe the programming model of SINGA to give users instructions for implementing
a new model and submitting a training job. The programming model is made almost transparent
to the underlying distributed environment, so users do not need to worry much about the
communication and synchronization of nodes, which is discussed in detail in the <a href="architecture.html">architecture</a>
page.</p>
+<div class="section">
+<h3><a name="Deep_learning_training"></a>Deep learning training</h3>
+<p>Deep learning is regarded as a feature learning technique, and a deep learning model usually
consists of multiple layers. Each layer is associated with a feature transformation function. After going
through all layers, raw input features (e.g., the pixels of an image) are converted into
high-level features that are easier to use for tasks like classification.</p>
+<p>Training a deep learning model means finding the optimal parameters of the
transformation functions, i.e., parameters that generate good features for the specific task. The goodness of
a set of parameters is measured by a loss function, e.g., the <a class="externalLink" href="https://en.wikipedia.org/wiki/Cross_entropy">Cross-Entropy
Loss</a>. Since loss functions are usually non-linear and non-convex, it is difficult
to derive a closed-form solution. Normally, people use the stochastic gradient descent (SGD) algorithm, which randomly initializes
the parameters and then iteratively updates them to reduce the loss, as sketched below.</p>
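+<p>As a minimal illustration of this SGD loop, the following plain C++ sketch minimizes a toy
quadratic loss. It is for illustration only and does not use SINGA's API; all names in it are
assumptions.</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">// Illustrative SGD loop (not SINGA code): minimize the toy loss
+// L(p) = 0.5 * sum_i p_i^2, whose gradient w.r.t. p_i is simply p_i.
+#include &lt;random&gt;
+#include &lt;vector&gt;
+
+int main() {
+  const float kLearningRate = 0.01f;  // step size (eta)
+  std::vector&lt;float&gt; params(10);
+
+  // randomly initialize the parameters
+  std::mt19937 gen(42);
+  std::normal_distribution&lt;float&gt; init(0.0f, 0.1f);
+  for (auto&amp; p : params) p = init(gen);
+
+  // iteratively update the parameters to reduce the loss
+  for (int step = 0; step &lt; 1000; ++step) {
+    for (auto&amp; p : params) {
+      float grad = p;             // dL/dp for the toy loss
+      p -= kLearningRate * grad;  // SGD update: p = p - eta * grad
+    }
+  }
+  return 0;
+}
+</pre></div></div></div>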
+<div class="section">
+<h3><a name="Steps_to_submit_a_training_job"></a>Steps to submit a training
job</h3>
+<p>SINGA uses the stochastic gradient descent (SGD) algorithm to train the parameters of
deep learning models. For each SGD iteration, there is a <a href="architecture.html">Worker</a>
computing gradients of parameters from the NeuralNet and an <a href="">Updater</a>
updating parameter values based on the gradients. SINGA has implemented three algorithms for gradient
calculation, namely back-propagation for feed-forward models, back-propagation through
time for recurrent neural networks, and contrastive divergence for energy models like RBM and
DBM. Several variants of the SGD updater are also provided, including <a class="externalLink" href="http://arxiv.org/pdf/1212.5701v1.pdf">AdaDelta</a>,
<a class="externalLink" href="http://www.magicbroom.info/Papers/DuchiHaSi10.pdf">AdaGrad</a>,
<a class="externalLink" href="http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf">RMSProp</a>,
and <a class="externalLink" href="http://scholar.google.com/citations?view_op=view_citation&amp;hl=en&amp;user=DJ8Ep8YAAAAJ&amp;citation_for_view=DJ8Ep8YAAAAJ:hkOj_22Ku90C">Nesterov</a>.</p>
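+<p>As an illustration of how such updaters differ from plain SGD, the following plain C++ sketch
shows an AdaGrad-style update. It is not SINGA's Updater API; the function and variable names are
assumptions.</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">// Illustrative AdaGrad-style update (not SINGA code): each parameter
+// accumulates its squared gradients and scales the learning rate by the
+// inverse square root of that sum, so frequently-updated parameters
+// take smaller steps.
+#include &lt;cmath&gt;
+#include &lt;vector&gt;
+
+void AdaGradUpdate(std::vector&lt;float&gt;&amp; params,
+                   const std::vector&lt;float&gt;&amp; grads,
+                   std::vector&lt;float&gt;&amp; sq_grad_sums,  // per-parameter history
+                   float learning_rate, float epsilon = 1e-8f) {
+  for (size_t i = 0; i &lt; params.size(); ++i) {
+    sq_grad_sums[i] += grads[i] * grads[i];
+    params[i] -= learning_rate * grads[i] /
+                 (std::sqrt(sq_grad_sums[i]) + epsilon);
+  }
+}
+</pre></div></div>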
+<p>Consequently, to submit a training job, a user needs to:</p>
+
+<ol style="list-style-type: decimal">
+  
+<li>
+<p><a href="data.html">Prepare the data</a> for training, validation and
test.</p></li>
+  
+<li>
+<p><a href="layer.html">Implement the new Layers</a> to support specific
feature transformations  required in the new model.</p></li>
+  
+<li>
+<p>Configure the training job, including the <a href="architecture.html">cluster
setting</a> and the <a href="model-config.html">model configuration</a>; a configuration sketch follows this list.</p></li>
+</ol>
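+<p>For illustration only, a hypothetical <tt>job.conf</tt> in protobuf text format might look
like the sketch below. Apart from the job name (also set in the driver example later), every field
name here is an assumption, not SINGA's actual <tt>JobProto</tt> schema.</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint"># hypothetical job.conf sketch; field names are assumptions,
+# not the real JobProto schema
+job_name: &quot;my singa job&quot;
+train_steps: 1000
+cluster {
+  nworker_groups: 1
+  nserver_groups: 1
+}
+model {
+  layer { name: &quot;data&quot; type: &quot;kShardData&quot; }
+  layer { name: &quot;fc1&quot;  type: &quot;kInnerProduct&quot; srclayer: &quot;data&quot; }
+  layer { name: &quot;loss&quot; type: &quot;kSoftmaxLoss&quot;  srclayer: &quot;fc1&quot; }
+}
+</pre></div></div></div>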
+<div class="section">
+<h3><a name="Driver_program"></a>Driver program</h3>
+<p>Each training job has a driver program that</p>
+
+<ul>
+  
+<li>
+<p>registers the layers implemented by the user, and</p></li>
+  
+<li>
+<p>starts the <a class="externalLink" href="https://github.com/apache/incubator-singa/blob/master/include/trainer/trainer.h">Trainer</a>
 by providing the job configuration.</p></li>
+</ul>
+<p>An example driver program looks like this:</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">#include &quot;singa.h&quot;
+#include &quot;user-layer.h&quot;  // header for user defined layers
+
+DEFINE_int32(job, -1, &quot;Job ID&quot;);  // job ID generated by the SINGA script
+DEFINE_string(workspace, &quot;examples/mnist/&quot;, &quot;workspace of the
training job&quot;);
+DEFINE_bool(resume, false, &quot;resume from checkpoint&quot;);
+
+int main(int argc, char** argv) {
+  google::InitGoogleLogging(argv[0]);
+  gflags::ParseCommandLineFlags(&amp;argc, &amp;argv, true);
+
+  // register all user defined layers in user-layer.h
+  Register(kFooLayer, FooLayer);
+  ...
+
+  JobProto jobConf;
+  // read job configuration from text conf file
+  ReadProtoFromTextFile(&amp;jobConf, FLAGS_workspace + &quot;/job.conf&quot;);
+  Trainer trainer;
+  // start training; the job ID and resume flag come from the command line
+  trainer.Start(FLAGS_job, jobConf, FLAGS_resume);
+}
+</pre></div></div>
+<p>Users can also configure the job in the driver program instead of writing a configuration
file:</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">  JobProto jobConf;
+  jobConf.set_job_name(&quot;my singa job&quot;);
+  ... // configure cluster and model
+  Trainer trainer;
+  trainer.Start(FLAGS_job, jobConf, FLAGS_resume);
+</pre></div></div>
+<p>In the future, we will provide helper functions to make the configuration easier,
similar to <a class="externalLink" href="https://github.com/fchollet/keras">Keras</a>.</p>
+<p>Compile and link the driver program with the singa library to generate an executable,
e.g., one named <tt>mysinga</tt>; a sketch of a possible build command is shown below.</p>
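+<p>The compiler flags, paths, and library names in this build command are assumptions that
depend on your installation; it is a sketch, not official build instructions.</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint"># sketch of a possible build command (paths and library names are assumptions)
+g++ driver.cc -std=c++11 -I$SINGA_HOME/include -L$SINGA_HOME/lib \
+    -lsinga -lprotobuf -lglog -lgflags -o mysinga
+</pre></div></div>
+<p>To submit the job, pass the path of the executable and the workspace to the singa job
submission script:</p>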
+
+<div class="source">
+<div class="source"><pre class="prettyprint">./bin/singa-run.sh &lt;path
to mysinga&gt; -workspace=&lt;my job workspace&gt;
+</pre></div></div></div></div>
                   </div>
             </div>
           </div>


