singa-commits mailing list archives

From wang...@apache.org
Subject svn commit: r1831260 [3/8] - in /incubator/singa/site/trunk: en/ en/_sources/docs/ en/_static/css/ en/_static/js/ en/community/ en/develop/ en/docs/ en/docs/model_zoo/ en/docs/model_zoo/caffe/ en/docs/model_zoo/char-rnn/ en/docs/model_zoo/cifar10/ en/d...
Date Wed, 09 May 2018 15:25:27 GMT
Modified: incubator/singa/site/trunk/en/docs/layer.html
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/en/docs/layer.html?rev=1831260&r1=1831259&r2=1831260&view=diff
==============================================================================
--- incubator/singa/site/trunk/en/docs/layer.html (original)
+++ incubator/singa/site/trunk/en/docs/layer.html Wed May  9 15:25:26 2018
@@ -97,7 +97,7 @@
 <li class="toctree-l2"><a class="reference internal" href="device.html">Device</a></li>
 <li class="toctree-l2"><a class="reference internal" href="tensor.html">Tensor</a></li>
 <li class="toctree-l2 current"><a class="current reference internal" href="#">Layer</a><ul>
-<li class="toctree-l3"><a class="reference internal" href="#python-api">Python API</a></li>
+<li class="toctree-l3"><a class="reference internal" href="#module-singa.layer">Python API</a></li>
 <li class="toctree-l3"><a class="reference internal" href="#cpp-api">CPP API</a></li>
 </ul>
 </li>
@@ -194,8 +194,1116 @@
             
   <div class="section" id="layer">
 <h1>Layer<a class="headerlink" href="#layer" title="Permalink to this headline">¶</a></h1>
-<div class="section" id="python-api">
-<h2>Python API<a class="headerlink" href="#python-api" title="Permalink to this headline">¶</a></h2>
+<div class="section" id="module-singa.layer">
+<span id="python-api"></span><h2>Python API<a class="headerlink" href="#module-singa.layer" title="Permalink to this headline">¶</a></h2>
+<p>Python layers wrap the C++ layers to provide simpler construction APIs.</p>
+<p>Example usages:</p>
+<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">singa</span> <span class="k">import</span> <span class="n">layer</span>
+<span class="kn">from</span> <span class="nn">singa</span> <span class="k">import</span> <span class="n">tensor</span>
+<span class="kn">from</span> <span class="nn">singa</span> <span class="k">import</span> <span class="n">device</span>
+
+<span class="n">layer</span><span class="o">.</span><span class="n">engine</span> <span class="o">=</span> <span class="s1">&#39;cudnn&#39;</span>  <span class="c1"># to use cudnn layers</span>
+<span class="n">dev</span> <span class="o">=</span> <span class="n">device</span><span class="o">.</span><span class="n">create_cuda_gpu</span><span class="p">()</span>
+
+<span class="c1"># create a convolution layer</span>
+<span class="n">conv</span> <span class="o">=</span> <span class="n">layer</span><span class="o">.</span><span class="n">Conv2D</span><span class="p">(</span><span class="s1">&#39;conv&#39;</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">pad</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">input_sample_shape</span><span class="o">=</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">32</span><span class="p">))</span>
+
+<span class="c1"># init param values</span>
+<span class="n">w</span><span class="p">,</span> <span class="n">b</span> <span class="o">=</span> <span class="n">conv</span><span class="o">.</span><span class="n">param_values</span><span class="p">()</span>
+<span class="n">w</span><span class="o">.</span><span class="n">gaussian</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mf">0.01</span><span class="p">)</span>
+<span class="n">b</span><span class="o">.</span><span class="n">set_value</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
+<span class="n">conv</span><span class="o">.</span><span class="n">to_device</span><span class="p">(</span><span class="n">dev</span><span class="p">)</span>  <span class="c1"># move the layer data onto a CudaGPU device</span>
+
+<span class="n">x</span> <span class="o">=</span> <span class="n">tensor</span><span class="o">.</span><span class="n">Tensor</span><span class="p">((</span><span class="mi">3</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">32</span><span class="p">),</span> <span class="n">dev</span><span class="p">)</span>
+<span class="n">x</span><span class="o">.</span><span class="n">uniform</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
+<span class="n">y</span> <span class="o">=</span> <span class="n">conv</span><span class="o">.</span><span class="n">forward</span><span class="p">(</span><span class="kc">True</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span>
+
+<span class="n">dy</span> <span class="o">=</span> <span class="n">tensor</span><span class="o">.</span><span class="n">Tensor</span><span class="p">()</span>
+<span class="n">dy</span><span class="o">.</span><span class="n">reset_like</span><span class="p">(</span><span class="n">y</span><span class="p">)</span>
+<span class="n">dy</span><span class="o">.</span><span class="n">set_value</span><span class="p">(</span><span class="mf">0.1</span><span class="p">)</span>
+<span class="c1"># dp is a list of tensors for parameter gradients</span>
+<span class="n">dx</span><span class="p">,</span> <span class="n">dp</span> <span class="o">=</span> <span class="n">conv</span><span class="o">.</span><span class="n">backward</span><span class="p">(</span><span class="kc">True</span><span class="p">,</span> <span class="n">dy</span><span class="p">)</span>
+</pre></div>
+</div>
+<dl class="data">
+<dt id="singa.layer.engine">
+<code class="descclassname">singa.layer.</code><code class="descname">engine</code><em class="property"> = 'cudnn'</em><a class="headerlink" href="#singa.layer.engine" title="Permalink to this definition">¶</a></dt>
+<dd><p>engine is the prefix of the layer identifier.</p>
+<p>The value could be one of [<strong>‘cudnn’, ‘singacpp’, ‘singacuda’, ‘singacl’</strong>], for
+layers implemented using the cudnn library, Cpp, Cuda and OpenCL respectively.
+For example, the CudnnConvolution layer is identified by ‘cudnn_convolution’, and
+‘singacpp_convolution’ identifies the Convolution layer.
+Some layers are implemented using only Tensor functions and are therefore
+transparent to the underlying device. These layers have
+multiple identifiers, e.g., singacpp_dropout, singacuda_dropout and
+singacl_dropout all refer to the Dropout layer. Such a layer also has an extra
+identifier with the prefix ‘singa’, i.e., ‘singa_dropout’ also stands for the Dropout layer.</p>
+<p>engine is case-insensitive. Each Python layer creates the matching specific
+layer based on the engine attribute.</p>
+</dd></dl>
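The naming rule described for engine can be sketched in plain Python. This is only an illustration of the rule as stated in the text: the helper name `layer_identifier` and the constant `ENGINES` are hypothetical and are not part of singa.layer.

```python
# Hypothetical helper, not part of singa.layer: it only mirrors the rule
# "identifier = lowercased engine prefix + '_' + layer type".
ENGINES = ('cudnn', 'singacpp', 'singacuda', 'singacl')


def layer_identifier(engine, layer_type):
    """Build a layer identifier from a case-insensitive engine name."""
    prefix = engine.lower()
    if prefix not in ENGINES:
        raise ValueError('unknown engine: %r' % engine)
    return prefix + '_' + layer_type.lower()


print(layer_identifier('CUDNN', 'Convolution'))   # cudnn_convolution
print(layer_identifier('singacl', 'Dropout'))     # singacl_dropout
```

The extra ‘singa’ prefix for device-transparent layers is left out of the sketch because the text lists it as an additional identifier rather than as an engine value.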
+
+<dl class="class">
+<dt id="singa.layer.Layer">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Layer</code><span class="sig-paren">(</span><em>name</em>, <em>conf=None</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">object</span></code></p>
+<p>Base Python layer class.</p>
+<dl class="docutils">
+<dt>Typically, the life cycle of a layer instance includes:</dt>
+<dd><ol class="first last arabic simple">
+<li>construct the layer without input_sample_shapes, then go to 2; or
+construct the layer with input_sample_shapes, then go to 3;</li>
+<li>call setup to create the parameters and set up other meta fields</li>
+<li>call forward or access layer members</li>
+<li>call backward and get parameters for update</li>
+</ol>
+</dd>
+</dl>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>name</strong> (<em>str</em>) – layer name</td>
+</tr>
+</tbody>
+</table>
+<dl class="method">
+<dt id="singa.layer.Layer.setup">
+<code class="descname">setup</code><span class="sig-paren">(</span><em>in_shapes</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer.setup" title="Permalink to this definition">¶</a></dt>
+<dd><p>Call the C++ setup function to create params and set some meta data.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>in_shapes</strong> – if the layer accepts a single input Tensor, in_shapes is
+a single tuple specifying the input Tensor shape; if the layer
+accepts multiple input Tensors (e.g., the concatenation layer),
+in_shapes is a tuple of tuples, one for each input Tensor</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Layer.caffe_layer">
+<code class="descname">caffe_layer</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer.caffe_layer" title="Permalink to this definition">¶</a></dt>
+<dd><p>Create a singa layer based on caffe layer configuration.</p>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Layer.get_output_sample_shape">
+<code class="descname">get_output_sample_shape</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer.get_output_sample_shape" title="Permalink to this definition">¶</a></dt>
+<dd><p>Called after setup to get the shape of the output sample(s).</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">a tuple for a single output Tensor or a list of tuples if this layer
+has multiple outputs</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Layer.param_names">
+<code class="descname">param_names</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer.param_names" title="Permalink to this definition">¶</a></dt>
+<dd><table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">a list of strings, one for the name of one parameter Tensor</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Layer.param_values">
+<code class="descname">param_values</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer.param_values" title="Permalink to this definition">¶</a></dt>
+<dd><p>Return param value tensors.</p>
+<p>Parameter tensors are not stored as layer members because the cpp Tensor
+could be moved onto a different device when the layer device changes,
+which would leave a stored copy inconsistent.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">a list of tensors, one for each parameter</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Layer.forward">
+<code class="descname">forward</code><span class="sig-paren">(</span><em>flag</em>, <em>x</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer.forward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Forward propagate through this layer.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> – True (kTrain) for training; False (kEval) for evaluating;
+other values are reserved for future use.</li>
+<li><strong>x</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a><em> or </em><em>list&lt;Tensor&gt;</em>) – an input tensor if the layer is
+connected from a single layer; a list of tensors if the layer
+is connected from multiple layers.</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">a tensor if the layer is connected to a single layer; a list of
+tensors if the layer is connected to multiple layers.</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Layer.backward">
+<code class="descname">backward</code><span class="sig-paren">(</span><em>flag</em>, <em>dy</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer.backward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Backward propagate gradients through this layer.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> (<em>int</em>) – for future use.</li>
+<li><strong>dy</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a><em> or </em><em>list&lt;Tensor&gt;</em>) – the gradient tensor(s) of the
+objective loss w.r.t. the layer output y</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">&lt;dx, &lt;dp1, dp2..&gt;&gt;, where dx is a (set of) tensor(s) for the gradient of x,
+and dpi is the gradient of the i-th parameter</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Layer.to_device">
+<code class="descname">to_device</code><span class="sig-paren">(</span><em>device</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer.to_device" title="Permalink to this definition">¶</a></dt>
+<dd><p>Move layer state tensors onto the given device.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>device</strong> – SWIG-converted device, created using singa.device</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Layer.as_type">
+<code class="descname">as_type</code><span class="sig-paren">(</span><em>dtype</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer.as_type" title="Permalink to this definition">¶</a></dt>
+<dd></dd></dl>
+
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.Dummy">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Dummy</code><span class="sig-paren">(</span><em>name</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Dummy" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>A dummy layer that simply passes the data through in both the forward and
+backward passes (the input/output is a single tensor).</p>
+<dl class="method">
+<dt id="singa.layer.Dummy.get_output_sample_shape">
+<code class="descname">get_output_sample_shape</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Dummy.get_output_sample_shape" title="Permalink to this definition">¶</a></dt>
+<dd><p>Called after setup to get the shape of the output sample(s).</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">a tuple for a single output Tensor or a list of tuples if this layer
+has multiple outputs</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Dummy.setup">
+<code class="descname">setup</code><span class="sig-paren">(</span><em>input_sample_shape</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Dummy.setup" title="Permalink to this definition">¶</a></dt>
+<dd><p>Call the C++ setup function to create params and set some meta data.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>in_shapes</strong> – if the layer accepts a single input Tensor, in_shapes is
+a single tuple specifying the input Tensor shape; if the layer
+accepts multiple input Tensors (e.g., the concatenation layer),
+in_shapes is a tuple of tuples, one for each input Tensor</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Dummy.forward">
+<code class="descname">forward</code><span class="sig-paren">(</span><em>flag</em>, <em>x</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Dummy.forward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Return the input x</p>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Dummy.backward">
+<code class="descname">backward</code><span class="sig-paren">(</span><em>falg</em>, <em>dy</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Dummy.backward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Return dy, []</p>
+</dd></dl>
+
+</dd></dl>
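The life cycle listed for Layer and the pass-through behaviour documented for Dummy can be sketched without SINGA installed. The class below is a hypothetical pure-Python stand-in (the name `PassThrough` is an assumption, and it wraps no C++ layer); it only mirrors the documented contract: forward returns x, backward returns dy and an empty list of parameter gradients.

```python
class PassThrough(object):
    """Hypothetical stand-in that mimics the documented Dummy contract."""

    def __init__(self, name, input_sample_shape=None):
        self.name = name
        self.has_setup = False
        self.output_sample_shape = None
        # Step 1 of the life cycle: a shape given here triggers setup at once.
        if input_sample_shape is not None:
            self.setup(input_sample_shape)

    def setup(self, input_sample_shape):
        # Step 2: record meta fields; a pass-through layer has no parameters.
        self.output_sample_shape = input_sample_shape
        self.has_setup = True

    def get_output_sample_shape(self):
        assert self.has_setup, 'call setup() first'
        return self.output_sample_shape

    def forward(self, flag, x):
        # Step 3: the input is returned unchanged.
        return x

    def backward(self, flag, dy):
        # Step 4: the gradient is returned unchanged, with no param gradients.
        return dy, []


layer = PassThrough('dummy', input_sample_shape=(3, 32, 32))
y = layer.forward(True, [1.0, 2.0])
dx, dp = layer.backward(True, [0.1, 0.1])
print(layer.get_output_sample_shape(), y, dx, dp)
```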
+
+<dl class="class">
+<dt id="singa.layer.Conv2D">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Conv2D</code><span class="sig-paren">(</span><em>name</em>, <em>nb_kernels</em>, <em>kernel=3</em>, <em>stride=1</em>, <em>border_mode='same'</em>, <em>cudnn_prefer='fastest'</em>, <em>workspace_byte_limit=1024</em>, <em>data_format='NCHW'</em>, <em>use_bias=True</em>, <em>W_specs=None</em>, <em>b_specs=None</em>, <em>pad=None</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Conv2D" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Construct a layer for 2D convolution.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>nb_kernels</strong> (<em>int</em>) – number of kernels, i.e., the number of channels of the output Tensor</li>
+<li><strong>kernel</strong> – an integer or a pair of integers for kernel height and width</li>
+<li><strong>stride</strong> – an integer or a pair of integers for stride height and width</li>
+<li><strong>border_mode</strong> (<em>string</em>) – padding mode, case-insensitive:
+‘valid’ -&gt; padding is 0 for height and width;
+‘same’ -&gt; padding is half of the kernel size (floor); the kernel size must be
+an odd number.</li>
+<li><strong>cudnn_prefer</strong> (<em>string</em>) – the preferred algorithm for cudnn convolution,
+which could be ‘fastest’, ‘autotune’, ‘limited_workspace’ or
+‘no_workspace’</li>
+<li><strong>workspace_byte_limit</strong> (<em>int</em>) – max workspace size in MB (the default in the signature is 1024)</li>
+<li><strong>data_format</strong> (<em>string</em>) – either ‘NCHW’ or ‘NHWC’</li>
+<li><strong>use_bias</strong> (<em>bool</em>) – True or False</li>
+<li><strong>pad</strong> – an integer or a pair of integers for padding height and width</li>
+<li><strong>W_specs</strong> (<em>dict</em>) – used to specify the weight matrix specs, fields
+include,
+‘name’ for parameter name
+‘lr_mult’ for learning rate multiplier
+‘decay_mult’ for weight decay multiplier
+‘init’ for init method, which could be ‘gaussian’, ‘uniform’,
+‘xavier’ and ‘’
+‘std’, ‘mean’, ‘high’, ‘low’ for corresponding init methods
+TODO(wangwei) ‘clamp’ for gradient constraint, value is scalar
+‘regularizer’ for regularization, currently support ‘l2’</li>
+<li><strong>b_specs</strong> (<em>dict</em>) – hyper-parameters for the bias vector, similar to W_specs</li>
+<li><strong>name</strong> (<em>string</em>) – layer name.</li>
+<li><strong>input_sample_shape</strong> – 3d tuple for the shape of the input Tensor
+without the batchsize, e.g., (channel, height, width) or
+(height, width, channel)</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+<dl class="method">
+<dt id="singa.layer.Conv2D.setup">
+<code class="descname">setup</code><span class="sig-paren">(</span><em>in_shape</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Conv2D.setup" title="Permalink to this definition">¶</a></dt>
+<dd><p>Set up the kernel, stride and padding; then call the C++ setup
+function to create params and set some meta data.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>in_shape</strong> – a tuple of int for the input sample shape</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
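The border_mode rules above can be worked through with a small sketch. It assumes the NCHW sample layout (channel, height, width) and the usual convolution size formula, out = (in + 2 * pad - kernel) // stride + 1; the helper names are hypothetical, and the actual shapes reported by Conv2D.get_output_sample_shape come from the C++ layer, which this code does not call.

```python
def _pair(value):
    """Expand an int to (value, value); pass a pair through unchanged."""
    if isinstance(value, int):
        return (value, value)
    return tuple(value)


def conv2d_padding(kernel, border_mode='same'):
    """Padding (height, width) implied by border_mode, as described above."""
    kh, kw = _pair(kernel)
    mode = border_mode.lower()  # border_mode is case-insensitive
    if mode == 'valid':
        return (0, 0)
    if mode == 'same':
        if kh % 2 == 0 or kw % 2 == 0:
            raise ValueError("'same' padding needs an odd kernel size")
        return (kh // 2, kw // 2)  # floor of half the kernel size
    raise ValueError('unsupported border_mode: %r' % border_mode)


def conv2d_output_sample_shape(in_shape, nb_kernels, kernel=3, stride=1,
                               border_mode='same'):
    """Output sample shape for an NCHW input sample, without the batch size."""
    _, height, width = in_shape
    kh, kw = _pair(kernel)
    sh, sw = _pair(stride)
    ph, pw = conv2d_padding(kernel, border_mode)
    out_h = (height + 2 * ph - kh) // sh + 1
    out_w = (width + 2 * pw - kw) // sw + 1
    return (nb_kernels, out_h, out_w)


print(conv2d_output_sample_shape((3, 32, 32), 32))                       # (32, 32, 32)
print(conv2d_output_sample_shape((3, 32, 32), 32, border_mode='VALID'))  # (32, 30, 30)
```

With a 3x3 kernel and stride 1, ‘same’ keeps the 32x32 spatial size, while ‘valid’ shrinks it to 30x30.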
+
+<dl class="class">
+<dt id="singa.layer.Conv1D">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Conv1D</code><span class="sig-paren">(</span><em>name</em>, <em>nb_kernels</em>, <em>kernel=3</em>, <em>stride=1</em>, <em>border_mode='same'</em>, <em>cudnn_prefer='fastest'</em>, <em>workspace_byte_limit=1024</em>, <em>use_bias=True</em>, <em>W_specs={'init': 'Xavier'}</em>, <em>b_specs={'init': 'Constant'</em>, <em>'value': 0}</em>, <em>pad=None</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Conv1D" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Conv2D" title="singa.layer.Conv2D"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Conv2D</span></code></a></p>
+<p>Construct a layer for 1D convolution.</p>
+<p>Most of the args are the same as those for Conv2D, except kernel,
+stride and pad, each of which is a scalar instead of a tuple.
+input_sample_shape is a tuple with a single value for the input feature
+length.</p>
+<dl class="method">
+<dt id="singa.layer.Conv1D.get_output_sample_shape">
+<code class="descname">get_output_sample_shape</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Conv1D.get_output_sample_shape" title="Permalink to this definition">¶</a></dt>
+<dd><p>Called after setup to get the shape of the output sample(s).</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">a tuple for a single output Tensor or a list of tuples if this layer
+has multiple outputs</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.Pooling2D">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Pooling2D</code><span class="sig-paren">(</span><em>name</em>, <em>mode</em>, <em>kernel=3</em>, <em>stride=2</em>, <em>border_mode='same'</em>, <em>pad=None</em>, <em>data_format='NCHW'</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Pooling2D" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>2D pooling layer providing max/avg pooling.</p>
+<p>All args are the same as those for Conv2D, except the following one:</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>mode</strong> – pooling type, model_pb2.PoolingConf.MAX or
+model_pb2.PoolingConf.AVE</td>
+</tr>
+</tbody>
+</table>
+<dl class="method">
+<dt id="singa.layer.Pooling2D.setup">
+<code class="descname">setup</code><span class="sig-paren">(</span><em>in_shape</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Pooling2D.setup" title="Permalink to this definition">¶</a></dt>
+<dd><p>Set up the kernel, stride and padding; then call the C++ setup
+function to create params and set some meta data.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>in_shape</strong> – a tuple of int for the input sample shape</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.MaxPooling2D">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">MaxPooling2D</code><span class="sig-paren">(</span><em>name</em>, <em>kernel=3</em>, <em>stride=2</em>, <em>border_mode='same'</em>, <em>pad=None</em>, <em>data_format='NCHW'</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.MaxPooling2D" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Pooling2D" title="singa.layer.Pooling2D"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Pooling2D</span></code></a></p>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.AvgPooling2D">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">AvgPooling2D</code><span class="sig-paren">(</span><em>name</em>, <em>kernel=3</em>, <em>stride=2</em>, <em>border_mode='same'</em>, <em>pad=None</em>, <em>data_format='NCHW'</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.AvgPooling2D" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Pooling2D" title="singa.layer.Pooling2D"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Pooling2D</span></code></a></p>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.MaxPooling1D">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">MaxPooling1D</code><span class="sig-paren">(</span><em>name</em>, <em>kernel=3</em>, <em>stride=2</em>, <em>border_mode='same'</em>, <em>pad=None</em>, <em>data_format='NCHW'</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.MaxPooling1D" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.MaxPooling2D" title="singa.layer.MaxPooling2D"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.MaxPooling2D</span></code></a></p>
+<dl class="method">
+<dt id="singa.layer.MaxPooling1D.get_output_sample_shape">
+<code class="descname">get_output_sample_shape</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.MaxPooling1D.get_output_sample_shape" title="Permalink to this definition">¶</a></dt>
+<dd><p>Called after setup to get the shape of the output sample(s).</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">a tuple for a single output Tensor or a list of tuples if this layer
+has multiple outputs</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.AvgPooling1D">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">AvgPooling1D</code><span class="sig-paren">(</span><em>name</em>, <em>kernel=3</em>, <em>stride=2</em>, <em>border_mode='same'</em>, <em>pad=None</em>, <em>data_format='NCHW'</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.AvgPooling1D" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.AvgPooling2D" title="singa.layer.AvgPooling2D"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.AvgPooling2D</span></code></a></p>
+<dl class="method">
+<dt id="singa.layer.AvgPooling1D.get_output_sample_shape">
+<code class="descname">get_output_sample_shape</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.AvgPooling1D.get_output_sample_shape" title="Permalink to this definition">¶</a></dt>
+<dd><p>Called after setup to get the shape of the output sample(s).</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">a tuple for a single output Tensor or a list of tuples if this layer
+has multiple outputs</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.BatchNormalization">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">BatchNormalization</code><span class="sig-paren">(</span><em>name</em>, <em>momentum=0.9</em>, <em>beta_specs=None</em>, <em>gamma_specs=None</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.BatchNormalization" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Batch-normalization.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>momentum</strong> (<em>float</em>) – momentum for the running averages of the mean and variance.</li>
+<li><strong>beta_specs</strong> (<em>dict</em>) – dictionary that includes the following fields for the beta
+param:
+‘name’ for the parameter name;
+‘lr_mult’ for the learning rate multiplier;
+‘decay_mult’ for the weight decay multiplier;
+‘init’ for the init method, which could be ‘gaussian’, ‘uniform’,
+‘xavier’ or ‘’;
+‘std’, ‘mean’, ‘high’, ‘low’ for the corresponding init methods;
+‘clamp’ for the gradient constraint, whose value is a scalar;
+‘regularizer’ for regularization, currently supporting ‘l2’</li>
+<li><strong>gamma_specs</strong> (<em>dict</em>) – similar to beta_specs, but for the gamma param.</li>
+<li><strong>name</strong> (<em>string</em>) – layer name</li>
+<li><strong>input_sample_shape</strong> (<em>tuple</em>) – with at least one integer</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.L2Norm">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">L2Norm</code><span class="sig-paren">(</span><em>name</em>, <em>input_sample_shape</em>, <em>epsilon=1e-08</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.L2Norm" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Normalize each sample to have L2 norm = 1</p>
+<dl class="method">
+<dt id="singa.layer.L2Norm.get_output_sample_shape">
+<code class="descname">get_output_sample_shape</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.L2Norm.get_output_sample_shape" title="Permalink to this definition">¶</a></dt>
+<dd><p>Called after setup to get the shape of the output sample(s).</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">a tuple for a single output Tensor or a list of tuples if this layer
+has multiple outputs</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.L2Norm.forward">
+<code class="descname">forward</code><span class="sig-paren">(</span><em>is_train</em>, <em>x</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.L2Norm.forward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Forward propagate through this layer.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> – True (kTrain) for training; False (kEval) for evaluating;
+other values for future use.</li>
+<li><strong>x</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a><em> or </em><em>list&lt;Tensor&gt;</em>) – an input tensor if the layer is
+connected from a single layer; a list of tensors if the layer
+is connected from multiple layers.</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">a tensor if the layer is connected to a single layer; a list of
+tensors if the layer is connected to multiple layers;</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.L2Norm.backward">
+<code class="descname">backward</code><span class="sig-paren">(</span><em>is_train</em>, <em>dy</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.L2Norm.backward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Backward propagate gradients through this layer.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> (<em>int</em>) – for future use.</li>
+<li><strong>dy</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a><em> or </em><em>list&lt;Tensor&gt;</em>) – the gradient tensor(s) of y w.r.t. the
+objective loss</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">&lt;dx, &lt;dp1, dp2..&gt;&gt;, where dx is a (set of) tensor(s) for the gradient of x,
+and dpi is the gradient of the i-th parameter</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
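+<p>The normalization described for L2Norm can be sketched in plain Python. This is an illustration only, not the library implementation: the helper name <code>l2_normalize</code> is hypothetical, samples are flat lists of floats, and adding epsilon to the squared norm before the square root is an assumption about where epsilon is applied.</p>

```python
import math


def l2_normalize(sample, epsilon=1e-8):
    """Scale one sample (a flat list of floats) so its L2 norm is about 1."""
    # Assumption: epsilon is added to the squared norm to avoid division by
    # zero; the actual layer may apply it at a different point.
    norm = math.sqrt(sum(v * v for v in sample) + epsilon)
    return [v / norm for v in sample]


# Each sample in the batch is normalized independently.
batch = [[3.0, 4.0], [0.0, 0.0]]
normalized = [l2_normalize(s) for s in batch]
```

+<p>An all-zero sample stays all-zero here because epsilon keeps the divisor positive.</p>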
+
+<dl class="class">
+<dt id="singa.layer.LRN">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">LRN</code><span class="sig-paren">(</span><em>name</em>, <em>size=5</em>, <em>alpha=1</em>, <em>beta=0.75</em>, <em>mode='cross_channel'</em>, <em>k=1</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.LRN" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Local response normalization.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>size</strong> (<em>int</em>) – number of channels to normalize
+across.</li>
+<li><strong>mode</strong> (<em>string</em>) – ‘cross_channel’</li>
+<li><strong>input_sample_shape</strong> (<em>tuple</em>) – 3d tuple, (channel, height, width)</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.Dense">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Dense</code><span class="sig-paren">(</span><em>name</em>, <em>num_output</em>, <em>use_bias=True</em>, <em>W_specs=None</em>, <em>b_specs=None</em>, <em>W_transpose=False</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Dense" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Apply linear/affine transformation, also called inner-product or
+fully connected layer.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>num_output</strong> (<em>int</em>) – output feature length.</li>
+<li><strong>use_bias</strong> (<em>bool</em>) – add a bias vector or not to the transformed feature</li>
+<li><strong>W_specs</strong> (<em>dict</em>) – specs for the weight matrix:
+‘name’ for the parameter name;
+‘lr_mult’ for the learning rate multiplier;
+‘decay_mult’ for the weight decay multiplier;
+‘init’ for the init method, which could be ‘gaussian’, ‘uniform’,
+‘xavier’ or ‘’;
+‘std’, ‘mean’, ‘high’, ‘low’ for the corresponding init methods;
+‘clamp’ for the gradient constraint, whose value is a scalar;
+‘regularizer’ for regularization, currently supporting ‘l2’</li>
+<li><strong>b_specs</strong> (<em>dict</em>) – specs for the bias vector, same fields as W_specs.</li>
+<li><strong>W_transpose</strong> (<em>bool</em>) – if true, output=x*W.T+b;</li>
+<li><strong>input_sample_shape</strong> (<em>tuple</em>) – input feature length</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
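+<p>The affine transformation performed by Dense, including the effect of W_transpose, can be sketched with nested lists. The function <code>dense_forward</code> is a hypothetical helper, not part of the library, and the stored weight layouts, (input, num_output) by default and (num_output, input) when W_transpose is true, are assumptions inferred from the formula output=x*W.T+b.</p>

```python
def dense_forward(x, W, b=None, W_transpose=False):
    """Compute x*W + b, or x*W.T + b when W_transpose is True.

    x is a list of rows with shape (batch, in_features).
    """
    if W_transpose:
        # Assumed layout (num_output, in_features): transpose it first.
        W = [list(col) for col in zip(*W)]
    columns = list(zip(*W))  # one tuple of weights per output feature
    out = []
    for row in x:
        y = [sum(xi * wi for xi, wi in zip(row, col)) for col in columns]
        if b is not None:
            y = [yi + bi for yi, bi in zip(y, b)]
        out.append(y)
    return out


x = [[1.0, 2.0]]
W = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]]      # (in_features=2, num_output=3)
W_t = [[1.0, 0.0], [0.0, 1.0], [2.0, 3.0]]  # same weights, transposed layout
b = [1.0, 1.0, 1.0]
y = dense_forward(x, W, b)
y_t = dense_forward(x, W_t, b, W_transpose=True)
```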
+
+<dl class="class">
+<dt id="singa.layer.Dropout">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Dropout</code><span class="sig-paren">(</span><em>name</em>, <em>p=0.5</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Dropout" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Dropout layer.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>p</strong> (<em>float</em>) – probability of dropping out an element, i.e., setting it to 0</li>
+<li><strong>name</strong> (<em>string</em>) – layer name</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.Activation">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Activation</code><span class="sig-paren">(</span><em>name</em>, <em>mode='relu'</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Activation" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Activation layers.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>name</strong> (<em>string</em>) – layer name</li>
+<li><strong>mode</strong> (<em>string</em>) – ‘relu’, ‘sigmoid’, or ‘tanh’</li>
+<li><strong>input_sample_shape</strong> (<em>tuple</em>) – shape of a single sample</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.Softmax">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Softmax</code><span class="sig-paren">(</span><em>name</em>, <em>axis=1</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Softmax" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Apply softmax.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>axis</strong> (<em>int</em>) – reshape the input as a matrix with dimensions
+[0, axis) as the rows and dimensions [axis, -1) as the columns.</li>
+<li><strong>input_sample_shape</strong> (<em>tuple</em>) – shape of a single sample</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.Flatten">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Flatten</code><span class="sig-paren">(</span><em>name</em>, <em>axis=1</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Flatten" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Reshape the input tensor into a matrix.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>axis</strong> (<em>int</em>) – reshape the input as a matrix with dimensions
+[0, axis) as the rows and dimensions [axis, -1) as the columns.</li>
+<li><strong>input_sample_shape</strong> (<em>tuple</em>) – shape for a single sample</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
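+<p>The axis parameter shared by Softmax and Flatten determines how a tensor shape collapses into a matrix shape. A minimal sketch follows, assuming the shape passed in includes the batch dimension as axis 0 and that the column range runs through the last dimension; <code>flatten_shape</code> is a hypothetical helper used only for illustration.</p>

```python
from functools import reduce
from operator import mul


def flatten_shape(shape, axis=1):
    """Return the (rows, cols) matrix shape for a tensor shape and axis."""
    rows = reduce(mul, shape[:axis], 1)  # product of dimensions [0, axis)
    cols = reduce(mul, shape[axis:], 1)  # product of the remaining dimensions
    return (rows, cols)


# A batch of 8 samples, each with 3 channels of 32x32 values.
default_axis = flatten_shape((8, 3, 32, 32), axis=1)
channel_rows = flatten_shape((8, 3, 32, 32), axis=2)
```

+<p>With the default axis=1 every sample becomes one row, which is the usual layout before a Dense layer.</p>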
+
+<dl class="class">
+<dt id="singa.layer.Merge">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Merge</code><span class="sig-paren">(</span><em>name</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Merge" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Sum all input tensors.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>input_sample_shape</strong> – sample shape of the input. The sample shape of all
+inputs should be the same.</td>
+</tr>
+</tbody>
+</table>
+<dl class="method">
+<dt id="singa.layer.Merge.setup">
+<code class="descname">setup</code><span class="sig-paren">(</span><em>in_shape</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Merge.setup" title="Permalink to this definition">¶</a></dt>
+<dd><p>Call the C++ setup function to create params and set some meta data.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>in_shapes</strong> – if the layer accepts a single input Tensor, in_shapes is
+a single tuple specifying the input Tensor shape; if the layer
+accepts multiple input Tensors (e.g., the concatenation layer),
+in_shapes is a tuple of tuples, each for one input Tensor</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Merge.get_output_sample_shape">
+<code class="descname">get_output_sample_shape</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Merge.get_output_sample_shape" title="Permalink to this definition">¶</a></dt>
+<dd><p>Called after setup to get the shape of the output sample(s).</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">a tuple for a single output Tensor or a list of tuples if this layer
+has multiple outputs</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Merge.forward">
+<code class="descname">forward</code><span class="sig-paren">(</span><em>flag</em>, <em>inputs</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Merge.forward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Merge all input tensors by summation.</p>
+<p>TODO(wangwei) do element-wise merge operations, e.g., avg, count</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> – not used.</li>
+<li><strong>inputs</strong> (<em>list</em>) – a list of tensors</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">A single tensor as the sum of all input tensors</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Merge.backward">
+<code class="descname">backward</code><span class="sig-paren">(</span><em>flag</em>, <em>grad</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Merge.backward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Replicate the grad for each input source layer.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>grad</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) – </td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body">A list of replicated grad, one per source layer</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
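+<p>The forward and backward behaviour described for Merge can be sketched with flat lists standing in for tensors. The names <code>merge_forward</code> and <code>merge_backward</code> are hypothetical and the flag argument is omitted because the documentation marks it as unused; this is an illustration of the semantics, not the library code.</p>

```python
def merge_forward(inputs):
    """Sum a list of equally shaped tensors element by element."""
    return [sum(values) for values in zip(*inputs)]


def merge_backward(grad, num_inputs):
    """Replicate the output gradient once per input source."""
    # The gradient of a sum w.r.t. each addend is the output gradient itself.
    return [list(grad) for _ in range(num_inputs)]


inputs = [[1, 2], [10, 20], [100, 200]]
merged = merge_forward(inputs)
grads = merge_backward([0.5, -1.0], num_inputs=len(inputs))
```

+<p>Split behaves as the mirror image: its forward pass replicates one tensor and its backward pass sums the incoming gradients.</p>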
+
+<dl class="class">
+<dt id="singa.layer.Split">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Split</code><span class="sig-paren">(</span><em>name</em>, <em>num_output</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Split" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Replicate the input tensor.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>num_output</strong> (<em>int</em>) – number of output tensors to generate.</li>
+<li><strong>input_sample_shape</strong> – includes a single integer for the input sample
+feature size.</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+<dl class="method">
+<dt id="singa.layer.Split.setup">
+<code class="descname">setup</code><span class="sig-paren">(</span><em>in_shape</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Split.setup" title="Permalink to this definition">¶</a></dt>
+<dd><p>Call the C++ setup function to create params and set some meta data.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>in_shapes</strong> – if the layer accepts a single input Tensor, in_shapes is
+a single tuple specifying the input Tensor shape; if the layer
+accepts multiple input Tensors (e.g., the concatenation layer),
+in_shapes is a tuple of tuples, each for one input Tensor</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Split.get_output_sample_shape">
+<code class="descname">get_output_sample_shape</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Split.get_output_sample_shape" title="Permalink to this definition">¶</a></dt>
+<dd><p>Called after setup to get the shape of the output sample(s).</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">a tuple for a single output Tensor or a list of tuples if this layer
+has multiple outputs</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Split.forward">
+<code class="descname">forward</code><span class="sig-paren">(</span><em>flag</em>, <em>input</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Split.forward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Replicate the input tensor into multiple tensors.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> – not used</li>
+<li><strong>input</strong> – a single input tensor</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">a list of output tensors (each one is a copy of the input)</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Split.backward">
+<code class="descname">backward</code><span class="sig-paren">(</span><em>flag</em>, <em>grads</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Split.backward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Sum all grad tensors to generate a single output tensor.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>grads</strong> (<em>list of Tensor</em>) – </td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body">a single tensor as the sum of all grads</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.Concat">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Concat</code><span class="sig-paren">(</span><em>name</em>, <em>axis</em>, <em>input_sample_shapes=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Concat" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Concatenate tensors vertically (axis = 0) or horizontally (axis = 1).</p>
+<p>Currently, only tensors with 2 dimensions are supported.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>axis</strong> (<em>int</em>) – 0 to concatenate rows; 1 to concatenate columns</li>
+<li><strong>input_sample_shapes</strong> – a list of sample shape tuples, one per input tensor</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+<dl class="method">
+<dt id="singa.layer.Concat.forward">
+<code class="descname">forward</code><span class="sig-paren">(</span><em>flag</em>, <em>inputs</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Concat.forward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Concatenate all input tensors.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> – same as Layer::forward()</li>
+<li><strong>inputs</strong> – a list of tensors</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">a single concatenated tensor</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Concat.backward">
+<code class="descname">backward</code><span class="sig-paren">(</span><em>flag</em>, <em>dy</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Concat.backward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Backward propagate gradients through this layer.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> – same as Layer::backward()</li>
+<li><strong>dy</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) – the gradient tensors of y w.r.t objective loss</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">&lt;dx, []&gt;, where dx is a list of tensors for the gradients of the inputs and [] is an empty list.</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.Slice">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Slice</code><span class="sig-paren">(</span><em>name</em>, <em>axis</em>, <em>slice_point</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Slice" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Slice the input tensor into multiple sub-tensors vertically (axis=0) or
+horizontally (axis=1).</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>axis</strong> (<em>int</em>) – 0 to slice rows; 1 to slice columns</li>
+<li><strong>slice_point</strong> (<em>list</em>) – positions along the axis at which to slice; there are n-1
+points for n sub-tensors</li>
+<li><strong>input_sample_shape</strong> – input tensor sample shape</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+<dl class="method">
+<dt id="singa.layer.Slice.get_output_sample_shape">
+<code class="descname">get_output_sample_shape</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Slice.get_output_sample_shape" title="Permalink to this definition">¶</a></dt>
+<dd><p>Called after setup to get the shape of the output sample(s).</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">a tuple for a single output Tensor or a list of tuples if this layer
+has multiple outputs</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Slice.forward">
+<code class="descname">forward</code><span class="sig-paren">(</span><em>flag</em>, <em>x</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Slice.forward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Slice the input tensor on the given axis.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> – same as Layer::forward()</li>
+<li><strong>x</strong> – a single input tensor</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">a list of output tensors</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Slice.backward">
+<code class="descname">backward</code><span class="sig-paren">(</span><em>flag</em>, <em>grads</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Slice.backward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Concatenate all grad tensors to generate a single output tensor.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> – same as Layer::backward()</li>
+<li><strong>grads</strong> – a list of tensors, one for the gradient of one sliced tensor</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">a single tensor for the gradient of the original input, and an empty list.</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
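+<p>The relation between slice_point and the resulting sub-tensors (n-1 points for n sub-tensors) can be illustrated on a single row of values. The helpers <code>slice_forward</code> and <code>slice_backward</code> are hypothetical stand-ins that operate on Python lists along one axis; they sketch the semantics described above and are not the library implementation.</p>

```python
def slice_forward(row, slice_point):
    """Cut a list at the given positions; n-1 points give n pieces."""
    bounds = [0] + list(slice_point) + [len(row)]
    return [row[lo:hi] for lo, hi in zip(bounds, bounds[1:])]


def slice_backward(grads):
    """Concatenate the per-piece gradients back into one list."""
    return [value for piece in grads for value in piece]


row = [0, 1, 2, 3, 4, 5, 6]
pieces = slice_forward(row, slice_point=[2, 5])
restored = slice_backward(pieces)
```

+<p>Two slice points produce three pieces of lengths 2, 3 and 2, and concatenating them restores the original layout.</p>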
+
+<dl class="class">
+<dt id="singa.layer.RNN">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">RNN</code><span class="sig-paren">(</span><em>name</em>, <em>hidden_size</em>, <em>rnn_mode='lstm'</em>, <em>dropout=0.0</em>, <em>num_stacks=1</em>, <em>input_mode='linear'</em>, <em>bidirectional=False</em>, <em>param_specs=None</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.RNN" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Recurrent layer with 4 types of units, namely lstm, gru, tanh and relu.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>hidden_size</strong> – hidden feature size, the same for all stacks of layers.</li>
+<li><strong>rnn_mode</strong> – decides the RNN unit, which could be one of ‘lstm’, ‘gru’,
+‘tanh’ and ‘relu’; refer to the cuDNN manual for each mode.</li>
+<li><strong>num_stacks</strong> – number of stacked RNN layers. It is different from the
+unrolling sequence length.</li>
+<li><strong>input_mode</strong> – ‘linear’ converts the input feature x by a linear
+transformation to get a feature vector of size hidden_size;
+‘skip’ does nothing but requires the input feature size to equal
+hidden_size</li>
+<li><strong>bidirectional</strong> – True for bidirectional RNN</li>
+<li><strong>param_specs</strong> – config for initializing the RNN parameters.</li>
+<li><strong>input_sample_shape</strong> – includes a single integer for the input sample
+feature size.</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+<dl class="method">
+<dt id="singa.layer.RNN.forward">
+<code class="descname">forward</code><span class="sig-paren">(</span><em>flag</em>, <em>inputs</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.RNN.forward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Forward inputs through the RNN.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> – True(kTrain) for training; False(kEval) for evaluation;
+other values for future use.</li>
+<li><strong>inputs</strong> – &lt;x1, x2, ... xn, hx, cx&gt;, where xi is the input tensor for the i-th position, its shape is (batch_size, input_feature_length);
+the batch_size of xi must be &gt;= that of xi+1; hx is the initial
+hidden state of shape (num_stacks * bidirection?2:1, batch_size,
+hidden_size). cx is the initial cell state tensor of the same
+shape as hx. cx is valid only for lstm. For other RNNs there is
+no cx. Both hx and cx could be dummy tensors without shape and
+data.</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last"><dl class="docutils">
+<dt>&lt;y1, y2, … yn, hy, cy&gt;, where yi is the output tensor for the i-th</dt>
+<dd><p class="first last">position, its shape is (batch_size,
+hidden_size * bidirection?2:1). hy is the final hidden state
+tensor. cy is the final cell state tensor. cy is only used for
+lstm.</p>
+</dd>
+</dl>
+</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.RNN.backward">
+<code class="descname">backward</code><span class="sig-paren">(</span><em>flag</em>, <em>grad</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.RNN.backward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Backward gradients through the RNN.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> – for future use.</li>
+<li><strong>grad</strong> – &lt;dy1, dy2, ... dyn, dhy, dcy&gt;, where dyi is the gradient for the
+i-th output, its shape is (batch_size, hidden_size * bidirection?2:1);
+dhy is the gradient for the final hidden state, its shape is
+(num_stacks * bidirection?2:1, batch_size,
+hidden_size). dcy is the gradient for the final cell state.
+dcy is valid only for lstm. For other RNNs there is
+no dcy. Both dhy and dcy could be dummy tensors without shape and
+data.</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last"><dl class="docutils">
+<dt>&lt;dx1, dx2, … dxn, dhx, dcx&gt;, where dxi is the gradient tensor for</dt>
+<dd><p class="first last">the i-th input, its shape is (batch_size,
+input_feature_length). dhx is the gradient for the initial
+hidden state. dcx is the gradient for the initial cell state,
+which is valid only for lstm.</p>
+</dd>
+</dl>
+</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.LSTM">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">LSTM</code><span class="sig-paren">(</span><em>name</em>, <em>hidden_size</em>, <em>dropout=0.0</em>, <em>num_stacks=1</em>, <em>input_mode='linear'</em>, <em>bidirectional=False</em>, <em>param_specs=None</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.LSTM" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.RNN" title="singa.layer.RNN"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.RNN</span></code></a></p>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.GRU">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">GRU</code><span class="sig-paren">(</span><em>name</em>, <em>hidden_size</em>, <em>dropout=0.0</em>, <em>num_stacks=1</em>, <em>input_mode='linear'</em>, <em>bidirectional=False</em>, <em>param_specs=None</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.GRU" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.RNN" title="singa.layer.RNN"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.layer.RNN</span></code></a></p>
+</dd></dl>
+
+<dl class="function">
+<dt id="singa.layer.get_layer_list">
+<code class="descclassname">singa.layer.</code><code class="descname">get_layer_list</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.get_layer_list" title="Permalink to this definition">¶</a></dt>
+<dd><p>Return a list of strings which include the identifiers (tags) of all
+supported layers</p>
+</dd></dl>
+
 </div>
 <div class="section" id="cpp-api">
 <h2>CPP API<a class="headerlink" href="#cpp-api" title="Permalink to this headline">¶</a></h2>
@@ -265,9 +1373,7 @@
 
   <script type="text/javascript">
       jQuery(function () {
-          
-          SphinxRtdTheme.Navigation.enableSticky();
-          
+          SphinxRtdTheme.Navigation.enable(true);
       });
   </script>
 

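The input and output shapes documented for `singa.layer.RNN.forward` in the layer.html hunk above can be tabulated with a small pure-Python sketch. The helper `rnn_io_shapes` is hypothetical and not part of SINGA. It assumes an lstm (so cx and cy are present), a constant batch size across positions, and final states hy and cy that share the shape of hx.

```python
def rnn_io_shapes(seq_len, batch_size, input_feature_length,
                  hidden_size, num_stacks=1, bidirection=False):
    """Hypothetical helper: shapes described for singa.layer.RNN.forward."""
    # "bidirection?2:1" in the docs: two directions if bidirectional, else one.
    num_directions = 2 if bidirection else 1
    x_shape = (batch_size, input_feature_length)
    # Assumption: hx, cx, hy and cy all share this shape.
    state_shape = (num_stacks * num_directions, batch_size, hidden_size)
    y_shape = (batch_size, hidden_size * num_directions)
    return {
        "inputs": [x_shape] * seq_len + [state_shape, state_shape],
        "outputs": [y_shape] * seq_len + [state_shape, state_shape],
    }


shapes = rnn_io_shapes(seq_len=3, batch_size=4, input_feature_length=8,
                       hidden_size=16, num_stacks=2, bidirection=True)
print("x1:", shapes["inputs"][0])
print("hx:", shapes["inputs"][-2])
print("y1:", shapes["outputs"][0])
```

The same state shape applies to the dhy and dcy gradients passed to `backward`, according to the parameter description above.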
Modified: incubator/singa/site/trunk/en/docs/loss.html
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/en/docs/loss.html?rev=1831260&r1=1831259&r2=1831260&view=diff
==============================================================================
--- incubator/singa/site/trunk/en/docs/loss.html (original)
+++ incubator/singa/site/trunk/en/docs/loss.html Wed May  9 15:25:26 2018
@@ -188,8 +188,224 @@
           <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
            <div itemprop="articleBody">
             
-  <div class="section" id="loss">
-<h1>Loss<a class="headerlink" href="#loss" title="Permalink to this headline">¶</a></h1>
+  <div class="section" id="module-singa.loss">
+<span id="loss"></span><h1>Loss<a class="headerlink" href="#module-singa.loss" title="Permalink to this headline">¶</a></h1>
+<p>The loss module includes a set of training loss implementations. Some are converted
+from C++ implementations, and the rest are implemented directly using the Python
+Tensor.</p>
+<p>Example usage:</p>
+<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
+<span class="kn">from</span> <span class="nn">singa</span> <span class="k">import</span> <span class="n">tensor</span><span class="p">,</span> <span class="n">loss</span>
+
+<span class="n">x</span> <span class="o">=</span> <span class="n">tensor</span><span class="o">.</span><span class="n">Tensor</span><span class="p">((</span><span class="mi">3</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
+<span class="n">x</span><span class="o">.</span><span class="n">uniform</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>  <span class="c1"># randomly generate the prediction activation</span>
+<span class="n">y</span> <span class="o">=</span> <span class="n">tensor</span><span class="o">.</span><span class="n">from_numpy</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">int32</span><span class="p">))</span>  <span class="c1"># set the truth</span>
+
+<span class="n">f</span> <span class="o">=</span> <span class="n">loss</span><span class="o">.</span><span class="n">SoftmaxCrossEntropy</span><span class="p">()</span>
+<span class="n">l</span> <span class="o">=</span> <span class="n">f</span><span class="o">.</span><span class="n">forward</span><span class="p">(</span><span class="kc">True</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>  <span class="c1"># l is a tensor with 3 loss values</span>
+<span class="n">g</span> <span class="o">=</span> <span class="n">f</span><span class="o">.</span><span class="n">backward</span><span class="p">()</span>  <span class="c1"># g is a tensor containing the gradients of l w.r.t. x</span>
+</pre></div>
+</div>
+<dl class="class">
+<dt id="singa.loss.Loss">
+<em class="property">class </em><code class="descclassname">singa.loss.</code><code class="descname">Loss</code><a class="headerlink" href="#singa.loss.Loss" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">object</span></code></p>
+<p>Base loss class.</p>
+<p>Subclasses that wrap the C++ loss classes can use the inherited forward,
+backward, and evaluate functions of this base class. Other subclasses need
+to override these functions.</p>
+<dl class="method">
+<dt id="singa.loss.Loss.backward">
+<code class="descname">backward</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.loss.Loss.backward" title="Permalink to this definition">¶</a></dt>
+<dd><table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">the gradient of the loss w.r.t. x</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.loss.Loss.evaluate">
+<code class="descname">evaluate</code><span class="sig-paren">(</span><em>flag</em>, <em>x</em>, <em>y</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.loss.Loss.evaluate" title="Permalink to this definition">¶</a></dt>
+<dd><table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> (<em>int</em>) – must be kEval, to be removed</li>
+<li><strong>x</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) – the prediction Tensor</li>
+<li><strong>y</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) – the ground truth Tensor</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">the averaged loss for all samples in x.</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.loss.Loss.forward">
+<code class="descname">forward</code><span class="sig-paren">(</span><em>flag</em>, <em>x</em>, <em>y</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.loss.Loss.forward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Compute the loss values.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> – kTrain/kEval or bool. If it is kTrain/True, then the backward
+function must be called before calling forward again.</li>
+<li><strong>x</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) – the prediction Tensor</li>
+<li><strong>y</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) – the ground truth Tensor; x.shape[0] must equal y.shape[0]</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">a tensor of floats for the loss values, one per sample</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.loss.SigmoidCrossEntropy">
+<em class="property">class </em><code class="descclassname">singa.loss.</code><code class="descname">SigmoidCrossEntropy</code><span class="sig-paren">(</span><em>epsilon=1e-08</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.loss.SigmoidCrossEntropy" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.loss.Loss" title="singa.loss.Loss"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.loss.Loss</span></code></a></p>
+<p>This loss evaluates the cross-entropy loss between the prediction and the
+truth values with the prediction probability generated from Sigmoid.</p>
+<dl class="method">
+<dt id="singa.loss.SigmoidCrossEntropy.backward">
+<code class="descname">backward</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.loss.SigmoidCrossEntropy.backward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Compute the gradient of the loss w.r.t. x.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">dx = pi - yi.</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.loss.SigmoidCrossEntropy.evaluate">
+<code class="descname">evaluate</code><span class="sig-paren">(</span><em>flag</em>, <em>x</em>, <em>y</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.loss.SigmoidCrossEntropy.evaluate" title="Permalink to this definition">¶</a></dt>
+<dd><p>Compute the averaged error.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">a float value as the averaged error</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.loss.SigmoidCrossEntropy.forward">
+<code class="descname">forward</code><span class="sig-paren">(</span><em>flag</em>, <em>x</em>, <em>y</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.loss.SigmoidCrossEntropy.forward" title="Permalink to this definition">¶</a></dt>
+<dd><p>loss is -yi * log pi - (1-yi) log (1-pi), where pi=sigmoid(xi)</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> (<em>bool</em>) – true for training; false for evaluation</li>
+<li><strong>x</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) – the prediction Tensor</li>
+<li><strong>y</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) – the truth Tensor, a binary array value per sample</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">a Tensor with one error value per sample</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.loss.SoftmaxCrossEntropy">
+<em class="property">class </em><code class="descclassname">singa.loss.</code><code class="descname">SoftmaxCrossEntropy</code><a class="headerlink" href="#singa.loss.SoftmaxCrossEntropy" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.loss.Loss" title="singa.loss.Loss"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.loss.Loss</span></code></a></p>
+<p>This loss function is a combination of SoftMax and Cross-Entropy loss.</p>
+<p>It converts the inputs via SoftMax function and then
+computes the cross-entropy loss against the ground truth values.</p>
+<p>For each sample, the ground truth could be an integer as the label index,
+or a binary array indicating the label distribution. The ground truth
+tensor thus could be a 1d or 2d tensor.
+The data/feature tensor could be 1d (for a single sample) or 2d (for a batch of
+samples).</p>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.loss.SquaredError">
+<em class="property">class </em><code class="descclassname">singa.loss.</code><code class="descname">SquaredError</code><a class="headerlink" href="#singa.loss.SquaredError" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.loss.Loss" title="singa.loss.Loss"><code class="xref py py-class docutils literal notranslate"><span class="pre">singa.loss.Loss</span></code></a></p>
+<p>This loss evaluates the squared error between the prediction and the
+truth values.</p>
+<p>It is implemented using Python Tensor operations.</p>
+<dl class="method">
+<dt id="singa.loss.SquaredError.backward">
+<code class="descname">backward</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.loss.SquaredError.backward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Compute the gradient of the error w.r.t. x.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">x - y</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.loss.SquaredError.evaluate">
+<code class="descname">evaluate</code><span class="sig-paren">(</span><em>flag</em>, <em>x</em>, <em>y</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.loss.SquaredError.evaluate" title="Permalink to this definition">¶</a></dt>
+<dd><p>Compute the averaged error.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">a float value as the averaged error</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.loss.SquaredError.forward">
+<code class="descname">forward</code><span class="sig-paren">(</span><em>flag</em>, <em>x</em>, <em>y</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.loss.SquaredError.forward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Compute the error as 0.5 * ||x-y||^2.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> (<em>int</em>) – kTrain or kEval; if kTrain, then the backward must be
+called before calling forward again.</li>
+<li><strong>x</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) – the prediction Tensor</li>
+<li><strong>y</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) – the truth Tensor, an integer value per sample, whose
+value is [0, x.shape[1])</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">a Tensor with one error value per sample</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
+
 </div>
 
 
@@ -255,9 +471,7 @@
 
   <script type="text/javascript">
       jQuery(function () {
-          
-          SphinxRtdTheme.Navigation.enableSticky();
-          
+          SphinxRtdTheme.Navigation.enable(true);
       });
   </script>
 

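The formulas documented for `singa.loss.SigmoidCrossEntropy` in the loss.html hunk above (loss `-yi * log pi - (1-yi) log (1-pi)` with `pi = sigmoid(xi)`, gradient `pi - yi`) can be sketched in plain NumPy. This is not the SINGA implementation. The function names are hypothetical, and two details are assumptions: that `epsilon` is added inside the logarithms for numerical stability, and that the per-sample value is the mean over the feature axis.

```python
import numpy as np


def sigmoid_cross_entropy_forward(x, y, epsilon=1e-8):
    """Return (per-sample loss, probabilities p = sigmoid(x))."""
    p = 1.0 / (1.0 + np.exp(-x))
    # Assumption: epsilon guards against log(0).
    elementwise = -y * np.log(p + epsilon) - (1.0 - y) * np.log(1.0 - p + epsilon)
    # Assumption: one value per sample is obtained by averaging over features.
    return elementwise.mean(axis=1), p


def sigmoid_cross_entropy_backward(p, y):
    """Gradient of the element-wise loss w.r.t. x, as documented: p - y."""
    return p - y


x = np.array([[0.0, 2.0], [-1.0, 0.5]])   # predictions (pre-sigmoid)
y = np.array([[1.0, 0.0], [0.0, 1.0]])    # binary truth per element
loss, p = sigmoid_cross_entropy_forward(x, y)
grad = sigmoid_cross_entropy_backward(p, y)
print(loss.shape, grad.shape)
```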

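The `singa.loss.SoftmaxCrossEntropy` description above says the ground truth may be either a label index per sample or a binary array per sample. A minimal NumPy sketch of that equivalence follows. It is not the SINGA code, the function name is hypothetical, and it assumes the batched case only (2d predictions).

```python
import numpy as np


def softmax_cross_entropy(x, y):
    """Per-sample softmax cross-entropy for a batch x of shape (batch, classes).

    y is either a 1d array of label indices or a 2d array of label
    distributions (for example one-hot rows).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = x - x.max(axis=1, keepdims=True)
    log_p = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    if y.ndim == 1:
        # Label indices: pick the log-probability of the true class.
        return -log_p[np.arange(x.shape[0]), y.astype(int)]
    # Label distributions: weighted sum of log-probabilities.
    return -(y * log_p).sum(axis=1)


x = np.random.RandomState(0).uniform(0, 1, size=(3, 5))
labels = np.array([0, 1, 3])
one_hot = np.eye(5)[labels]
loss_from_labels = softmax_cross_entropy(x, labels)
loss_from_one_hot = softmax_cross_entropy(x, one_hot)
print(np.allclose(loss_from_labels, loss_from_one_hot))
```

Both encodings give the same three loss values, which mirrors the shapes used in the example usage at the top of the Loss page (a 3 x 5 prediction and three labels).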
