singa-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From wang...@apache.org
Subject svn commit: r1738695 [3/10] - in /incubator/singa/site/trunk/content: ./ markdown/docs/ markdown/docs/zh/ markdown/releases/ markdown/v0.2.0/ markdown/v0.2.0/jp/ markdown/v0.2.0/kr/ markdown/v0.2.0/zh/
Date Tue, 12 Apr 2016 06:22:21 GMT
Added: incubator/singa/site/trunk/content/markdown/v0.2.0/jp/examples.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.2.0/jp/examples.md?rev=1738695&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.2.0/jp/examples.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.2.0/jp/examples.md Tue Apr 12 06:22:20 2016
@@ -0,0 +1,29 @@
+# Example Models
+
+---
+
+Different models are provided as examples to help users get familiar with SINGA.
+[Neural Network](neural-net.html) gives details on the models that are
+supported by SINGA.
+
+
+### Feed-forward neural networks
+
+  * [MultiLayer Perceptron](mlp.html) trained on MNIST dataset for handwritten
+  digits recognition.
+
+  * [Convolutional Neural Network](cnn.html) trained on MNIST and CIFAR10 for
+  image classification.
+
+  * [Deep Auto-Encoders](rbm.html) trained on MNIST for dimensionality
+
+
+### Recurrent neural networks (RNN)
+
+ * [RNN language model](rnn.html) trained on plain text for language modelling.
+
+### Energy models
+
+ * [RBM](rbm.html) used to pre-train deep auto-encoders for dimensionality
+ reduction.
+

Added: incubator/singa/site/trunk/content/markdown/v0.2.0/jp/frameworks.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.2.0/jp/frameworks.md?rev=1738695&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.2.0/jp/frameworks.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.2.0/jp/frameworks.md Tue Apr 12 06:22:20 2016
@@ -0,0 +1,119 @@
+# 分散トレーニング フレームワーク
+
+---
+
+## クラスタトポロジーの設定
+
+クラスタトポロジーは `JobProto` の `cluster` フィールドで設定します。
+`cluster` の type は `ClusterProto` です。
+
+    message ClusterProto {
+      optional int32 nworker_groups = 1;
+      optional int32 nserver_groups = 2;
+      optional int32 nworkers_per_group = 3 [default = 1];
+      optional int32 nservers_per_group = 4 [default = 1];
+      optional int32 nworkers_per_procs = 5 [default = 1];
+      optional int32 nservers_per_procs = 6 [default = 1];
+
+      // servers and workers in different processes?
+      optional bool server_worker_separate = 20 [default = false];
+
+      ......
+    }
+
+
+下記のフィールドを設定して、クラスタトポロジーをカスタマイズします。
+
+  * `nworkers_per_group` と `nworkers_per_procs`
+  decide the partitioning of worker side ParamShard.
+
+  * `nservers_per_group` と `nservers_per_procs`
+  decide the partitioning of server side ParamShard.
+
+  * `server_worker_separate`
+  separate servers and workers in different processes.
+
+## トレーニング フレームワーク
+
+In SINGA, worker groups run asynchronously and workers within one group run synchronously.
+Users can leverage this general design to run both **synchronous** and **asynchronous** training frameworks.
+Here we illustrate how to configure popular distributed training frameworks in SINGA.
+
+<img src="../../images/frameworks.png" style="width: 800px"/>
+<p><strong> Fig.1 - Training frameworks in SINGA</strong></p>
+
+### Sandblaster
+
+Google Brain で使われている **synchronous** フレームワークです。
+Fig.2(a)に、SINGA で実装された Sandblaster を示します。
+cluster フィールドの設定は次のとおりです。
+
+    cluster {
+        nworker_groups: 1
+        nserver_groups: 1
+        nworkers_per_group: 3
+        nservers_per_group: 2
+        server_worker_separate: true
+    }
+
+A single server group is launched to handle all requests from workers.
+A worker computes on its partition of the model,
+and only communicates with servers handling related parameters.
+
+
+### AllReduce
+
+Baidu's DeepImage で使われている **synchronous** フレームワークです。
+Fig.2(b)に、SINGA で実装された AllReduce を示します。
+cluster フィールドの設定は次のとおりです。
+
+    cluster {
+        nworker_groups: 1
+        nserver_groups: 1
+        nworkers_per_group: 3
+        nservers_per_group: 3
+        server_worker_separate: false
+    }
+
+We bind each worker with a server on the same node, so that each
+node is responsible for maintaining a partition of parameters and
+collecting updates from all other nodes.
+
+### Downpour
+
+Google Brain で使われている **asynchronous** フレームワークです。
+Fig.2(c)に、SINGA で実装された Downpour を示します。
+cluster フィールドの設定は次のとおりです。
+
+    cluster {
+        nworker_groups: 2
+        nserver_groups: 1
+        nworkers_per_group: 2
+        nservers_per_group: 2
+        server_worker_separate: true
+    }
+
+Similar to the synchronous Sandblaster, all workers send
+requests to a global server group. We divide workers into several
+worker groups, each running independently and working on parameters
+from the last *update* response.
+
+###Distributed Hogwild
+
+Caffe で使われている **asynchronous** フレームワークです。
+Fig.2(d)に、SINGA で実装された Distributed Hogwild を示します。
+cluster フィールドの設定は次のとおりです。
+
+    cluster {
+        nworker_groups: 3
+        nserver_groups: 3
+        nworkers_per_group: 1
+        nservers_per_group: 1
+        server_worker_separate: false
+    }
+
+Each node contains a complete server group and a complete worker group.
+Parameter updates are done locally, so that communication cost
+during each training step is minimized.
+However, the server group must periodically synchronize with
+neighboring groups to improve the training convergence.

Added: incubator/singa/site/trunk/content/markdown/v0.2.0/jp/index.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.2.0/jp/index.md?rev=1738695&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.2.0/jp/index.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.2.0/jp/index.md Tue Apr 12 06:22:20 2016
@@ -0,0 +1,23 @@
+# 最新ドキュメント
+
+---
+
+* [イントロダクション](overview.html)
+* [インストール](installation.html)
+* [クイックスタート](quick-start.html)
+* [プログラミング ガイド](programming-guide.html)
+    * [NeuralNet](neural-net.html)
+        * [Layer](layer.html)
+        * [Param](param.html)
+    * [TrainOneBatch](train-one-batch.html)
+    * [Updater](updater.html)
+* [分散 トレーニング](distributed-training.html)
+* [データの準備](data.html)
+* [Checkpoint と Resume](checkpoint.html)
+* [パフォーマンステスト と 特徴抽出](test.html)
+* [サンプル](examples.html)
+    * Feed-forward モデル
+        * [CNN](cnn.html)
+        * [MLP](mlp.html)
+    * [RBM + Auto-encoder](rbm.html)
+    * [RNN](rnn.html)

Added: incubator/singa/site/trunk/content/markdown/v0.2.0/jp/installation.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.2.0/jp/installation.md?rev=1738695&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.2.0/jp/installation.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.2.0/jp/installation.md Tue Apr 12 06:22:20 2016
@@ -0,0 +1,8 @@
+# インストール
+
+---
+
+下の2つの方法から選んでください。
+
+* ソースからビルド [Build SINGA directly from source](installation_source.html)
+* Dockerコンテナのビルド [Build SINGA as a Docker container](docker.html)

Added: incubator/singa/site/trunk/content/markdown/v0.2.0/jp/installation_source.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.2.0/jp/installation_source.md?rev=1738695&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.2.0/jp/installation_source.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.2.0/jp/installation_source.md Tue Apr 12 06:22:20 2016
@@ -0,0 +1,239 @@
+# ソースからのビルド
+
+---
+
+## Dependencies (依存性)
+
+SINGA は Linux プラットフォームで開発されました。
+
+下記のライブラリを必要とします。ご確認ください。
+
+  * glog version 0.3.3
+
+  * google-protobuf version 2.6.0
+
+  * openblas version >= 0.2.10
+
+  * zeromq version >= 3.2
+
+  * czmq version >= 3
+
+  * zookeeper version 3.4.6
+
+
+オプション
+
+  * lmdb version 0.9.10
+
+
+すべての Dependencies をインストールするには
+
+    # thirdparty フォルダに移動
+    cd thirdparty
+    ./install.sh all $PREFIX
+
+$PREFIX がシステムパス (例 /usr/local/) でない場合は、下記の変数を exportコマンドで設定してください。
+
+    export LD_LIBRARY_PATH=$PREFIX/lib:$LD_LIBRARY_PATH
+    export CPLUS_INCLUDE_PATH=$PREFIX/include:$CPLUS_INCLUDE_PATH
+    export LIBRARY_PATH=$PREFIX/lib:$LIBRARY_PATH
+    export PATH=$PREFIX/bin:$PATH
+
+インストールスクリプトの詳細は、このページの下にあります。
+
+## ソースからのビルド
+
+GNU autotools を使用します。まずは GCC (version >= 4.8) のバージョンを確認してください。
+
+SINGA をビルドする2つの方法を用意しました。
+
+  * Git コマンドを使用して [Github](https://github.com/apache/incubator-singa.git) から最新のソースコード(レポジトリ)をクローンし、次のコマンドを実行します。
+
+        $ git clone git@github.com:apache/incubator-singa.git
+        $ cd incubator-singa
+        $ ./autogen.sh
+        $ ./configure
+        $ make
+
+  Note: [nusinga](https://github.com/orgs/nusinga) アカウントの singa repository は Apache Incubator project として使用していたものなので、最新ではありません。ご注意ください。
+
+  * Release パッケージをダウンロードし、次のコマンドを実行します。
+
+        $ tar xvf singa-xxx
+        $ cd singa-xxx
+        $ ./configure
+        $ make
+
+    ある機能は外部ライブラリに依存します。
+    それらの機能を利用するために、`--enable-<feature>` を付けてコンパイルしてください。
+    例えば、lmdb サポートを追加するには
+
+        $ ./configure --enable-lmdb
+
+<!---
+Zhongle: please update the code to use the follow command
+
+    $ make test
+
+After compilation, you will find the binary file singatest. Just run it!
+More details about configure script can be found by running:
+
+		$ ./configure -h
+-->
+
+うまくコンパイルが成功すると、*.libs/* フォルダ内に *libsinga.so* と実行ファイル *singa* が生成されます。
+
+特定の dependent ライブラリが見つからない場合、次のスクリプトを実行してください。
+<!---
+to be updated after zhongle changes the code to use
+
+    ./install.sh libname \-\-prefix=
+
+-->
+    # thirdparty フォルダに移動
+    $ cd thirdparty
+    $ ./install.sh LIB_NAME PREFIX
+
+インストールパスを指定しない場合、ライブラリは(ソフトウェアが指定した)デフォルトのフォルダにインストールされます。
+例えば、デフォルトのフォルダに `zeromq` ライブラリをインストールするには、次のコマンドを
+
+    $ ./install.sh zeromq
+
+別のフォルダ(e.g., PREFIX)にインストールするには、次のコマンドを
+
+    $ ./install.sh zeromq PREFIX
+
+*/usr/local* フォルダにすべての Dependencies をインストールするには、次のコマンドを
+
+    $ ./install.sh all /usr/local
+
+ライブラリ名(LIB_NAME)は以下のとおりです。
+
+    LIB_NAME          Library
+    czmq(注)        czmq lib
+    glog              glog lib
+    lmdb              lmdb lib
+    OpenBLAS          OpenBLAS lib
+    protobuf          Google protobuf
+    zeromq            zeromq lib
+    zookeeper         Apache zookeeper
+
+(注) `czmq` をインストールする時、`zeromq` のパスを -f オプションで指定する必要があります。
+コマンドは次のとおりです。
+
+<!---
+to be updated to
+
+    $./install.sh czmq  \-\-prefix=/usr/local \-\-zeromq=/usr/local/zeromq
+-->
+
+    $./install.sh czmq  /usr/local -f=/usr/local/zeromq
+
+結果、*/usr/local* フォルダに `czmq` がインストールされます。
+
+### FAQ
+* Q1:I get error `./configure --> cannot find blas_segmm() function` even I
+have installed OpenBLAS.
+
+  A1: This means the compiler cannot find the `OpenBLAS` library. If you installed
+  it to $PREFIX (e.g., /opt/OpenBLAS), then you need to export it as
+
+      $ export LIBRARY_PATH=$PREFIX/lib:$LIBRARY_PATH
+      # e.g.,
+      $ export LIBRARY_PATH=/opt/OpenBLAS/lib:$LIBRARY_PATH
+
+
+* Q2: I get error `cblas.h no such file or directory exists`.
+
+  Q2: You need to include the folder of the cblas.h into CPLUS_INCLUDE_PATH,
+  e.g.,
+
+      $ export CPLUS_INCLUDE_PATH=$PREFIX/include:$CPLUS_INCLUDE_PATH
+      # e.g.,
+      $ export CPLUS_INCLUDE_PATH=/opt/OpenBLAS/include:$CPLUS_INCLUDE_PATH
+      # then reconfigure and make SINGA
+      $ ./configure
+      $ make
+
+
+* Q3:While compiling SINGA, I get error `SSE2 instruction set not enabled`
+
+  A3:You can try following command:
+
+      $ make CFLAGS='-msse2' CXXFLAGS='-msse2'
+
+
+* Q4:I get `ImportError: cannot import name enum_type_wrapper` from
+google.protobuf.internal when I try to import .py files.
+
+  A4:After install google protobuf by `make install`, we should install python
+  runtime libraries. Go to protobuf source directory, run:
+
+      $ cd /PROTOBUF/SOURCE/FOLDER
+      $ cd python
+      $ python setup.py build
+      $ python setup.py install
+
+  You may need `sudo` when you try to install python runtime libraries in
+  the system folder.
+
+
+* Q5: I get a linking error caused by gflags.
+
+  A5: SINGA does not depend on gflags. But you may have installed the glog with
+  gflags. In that case you can reinstall glog using *thirdparty/install.sh* into
+  a another folder and export the LDFLAGS and CPPFLAGS to include that folder.
+
+
+* Q6: While compiling SINGA and installing `glog` on mac OS X, I get fatal error
+`'ext/slist' file not found`
+
+  A6:Please install `glog` individually and try :
+
+      $ make CFLAGS='-stdlib=libstdc++' CXXFLAGS='stdlib=libstdc++'
+
+* Q7: When I start a training job, it reports error related with "ZOO_ERROR...zk retcode=-4...".
+
+  A7: This is because the zookeeper is not started. Please start the zookeeper service
+
+      $ ./bin/zk-service start
+
+  If the error still exists, probably that you do not have java. You can simple
+  check it by
+
+      $ java --version
+
+* Q8: When I build OpenBLAS from source, I am told that I need a fortran compiler.
+
+  A8: You can compile OpenBLAS by
+
+      $ make ONLY_CBLAS=1
+
+  or install it using
+
+	    $ sudo apt-get install openblas-dev
+
+  or
+
+	    $ sudo yum install openblas-devel
+
+  It is worth noting that you need root access to run the last two commands.
+  Remember to set the environment variables to include the header and library
+  paths of OpenBLAS after installation (please refer to the Dependencies section).
+
+* Q9: When I build protocol buffer, it reports that GLIBC++_3.4.20 not found in /usr/lib64/libstdc++.so.6.
+
+  A9: This means the linker found libstdc++.so.6 but that library
+  belongs to an older version of GCC than was used to compile and link the
+  program. The program depends on code defined in
+  the newer libstdc++ that belongs to the newer version of GCC, so the linker
+  must be told how to find the newer libstdc++ shared library.
+  The simplest way to fix this is to find the correct libstdc++ and export it to
+  LD_LIBRARY_PATH. For example, if GLIBC++_3.4.20 is listed in the output of the
+  following command,
+
+      $ strings /usr/local/lib64/libstdc++.so.6|grep GLIBC++
+
+  then you just set your environment variable as
+
+      $ export LD_LIBRARY_PATH=/usr/local/lib64:$LD_LIBRARY_PATH

Added: incubator/singa/site/trunk/content/markdown/v0.2.0/jp/layer.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.2.0/jp/layer.md?rev=1738695&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.2.0/jp/layer.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.2.0/jp/layer.md Tue Apr 12 06:22:20 2016
@@ -0,0 +1,614 @@
+# Layers
+
+---
+
+Layer is a core abstraction in SINGA. It performs a variety of feature
+transformations for extracting high-level features, e.g., loading raw features,
+parsing RGB values, doing convolution transformation, etc.
+
+The *Basic user guide* section introduces the configuration of a built-in
+layer. *Advanced user guide* explains how to extend the base Layer class to
+implement users' functions.
+
+## Basic user guide
+
+### Layer configuration
+
+Configuration of two example layers are shown below,
+
+    layer {
+      name: "data"
+      type: kCSVRecord
+      store_conf { }
+    }
+    layer{
+      name: "fc1"
+      type: kInnerProduct
+      srclayers: "data"
+      innerproduct_conf{ }
+      param{ }
+    }
+
+There are some common fields for all kinds of layers:
+
+  * `name`: a string used to differentiate two layers in a neural net.
+  * `type`: an integer used for identifying a specific Layer subclass. The types of built-in
+  layers are listed in LayerType (defined in job.proto).
+  For user-defined layer subclasses, `user_type` should be used instead of `type`.
+  * `srclayers`: names of the source layers.
+  In SINGA, all connections are [converted](neural-net.html) to directed connections.
+  * `param`: configuration for a [Param](param.html) instance.
+  There can be multiple Param objects in one layer.
+
+Different layers may have different configurations. These configurations
+are defined in `<type>_conf`.  E.g., "fc1" layer has
+`innerproduct_conf`. The subsequent sections
+explain the functionality of each built-in layer and how to configure it.
+
+### Built-in Layer subclasses
+SINGA has provided many built-in layers, which can be used directly to create neural nets.
+These layers are categorized according to their functionalities,
+
+  * Input layers for loading records (e.g., images) from disk files, HDFS or network into memory.
+  * Neuron layers for feature transformation, e.g., [convolution](../api/classsinga_1_1ConvolutionLayer.html), [pooling](../api/classsinga_1_1PoolingLayer.html), dropout, etc.
+  * Loss layers for measuring the training objective loss, e.g., Cross Entropy loss or Euclidean loss.
+  * Output layers for outputting the prediction results (e.g., probabilities of each category) or features into persistent storage, e.g., disk or HDFS.
+  * Connection layers for connecting layers when the neural net is partitioned.
+
+#### Input layers
+
+Input layers load training/test data from disk or other places (e.g., HDFS or network)
+into memory.
+
+##### StoreInputLayer
+
+[StoreInputLayer](../api/classsinga_1_1StoreInputLayer.html) is a base layer for
+loading data from data store. The data store can be a KVFile or TextFile (LMDB,
+LevelDB, HDFS, etc., will be supported later). Its `ComputeFeature` function reads
+batchsize (string:key, string:value) tuples. Each tuple is parsed by a `Parse` function
+implemented by its subclasses.
+
+The configuration for this layer is in `store_conf`,
+
+    store_conf {
+      backend: # "kvfile" or "textfile"
+      path: # path to the data store
+      batchsize :
+      ...
+    }
+
+##### SingleLabelRecordLayer
+
+It is a subclass of StoreInputLayer. It assumes the (key, value) tuple loaded
+from a data store contains a feature vector (and a label) for one data instance.
+All feature vectors are of the same fixed length. The shape of one instance
+is configured through the `shape` field, e.g., the following configuration
+specifies the shape for the CIFAR10 images.
+
+    store_conf {
+      shape: 3  #channels
+      shape: 32 #height
+      shape: 32 #width
+    }
+
+It may do some preprocessing like [standardization](http://ufldl.stanford.edu/wiki/index.php/Data_Preprocessing).
+The data for preprocessing is loaded by and parsed in a virtual function, which is implemented by
+its subclasses.
+
+##### RecordInputLayer
+
+It is a subclass of SingleLabelRecordLayer. It parses the value field from one
+tuple into a RecordProto, which is generated by Google Protobuf according
+to common.proto.  It can be used to store features for images (e.g., using the pixel field)
+or other objects (using the data field). The key field is not parsed.
+
+    type: kRecordInput
+    store_conf {
+      has_label: # default is true
+      ...
+    }
+
+##### CSVInputLayer
+
+It is a subclass of SingleLabelRecordLayer. The value field from one tuple is parsed
+as a CSV line (separated by comma). The first number would be parsed as a label if
+`has_label` is configured in `store_conf`. Otherwise, all numbers would be parsed
+into one row of the `data_` Blob.
+
+    type: kCSVInput
+    store_conf {
+      has_label: # default is true
+      ...
+    }
+
+##### ImagePreprocessLayer
+
+This layer does image preprocessing, e.g., cropping, mirroring and scaling, against
+the data Blob from its source layer. It deprecates the RGBImageLayer which
+works on the Record from ShardDataLayer. It still uses the same configuration as
+RGBImageLayer,
+
+    type: kImagePreprocess
+    rgbimage_conf {
+      scale: float
+      cropsize: int  # cropping each image to keep the central part with this size
+      mirror: bool  # mirror the image by set image[i,j]=image[i,len-j]
+      meanfile: "Image_Mean_File_Path"
+    }
+
+##### ShardDataLayer (Deprected)
+Deprected! Please use ProtoRecordInputLayer or CSVRecordInputLayer.
+
+[ShardDataLayer](../api/classsinga_1_1ShardDataLayer.html) is a subclass of DataLayer,
+which reads Records from disk file. The file should be created using
+[DataShard](../api/classsinga_1_1DataShard.html)
+class. With the data file prepared, users configure the layer as
+
+    type: kShardData
+    sharddata_conf {
+      path: "path to data shard folder"
+      batchsize: int
+      random_skip: int
+    }
+
+`batchsize` specifies the number of records to be trained for one mini-batch.
+The first `rand() % random_skip` `Record`s will be skipped at the first
+iteration. This is to enforce that different workers work on different Records.
+
+##### LMDBDataLayer (Deprected)
+Deprected! Please use ProtoRecordInputLayer or CSVRecordInputLayer.
+
+[LMDBDataLayer] is similar to ShardDataLayer, except that the Records are
+loaded from LMDB.
+
+    type: kLMDBData
+    lmdbdata_conf {
+      path: "path to LMDB folder"
+      batchsize: int
+      random_skip: int
+    }
+
+##### ParserLayer (Deprected)
+Deprected! Please use ProtoRecordInputLayer or CSVRecordInputLayer.
+
+It get a vector of Records from DataLayer and parse features into
+a Blob.
+
+    virtual void ParseRecords(Phase phase, const vector<Record>& records, Blob<float>* blob) = 0;
+
+
+##### LabelLayer (Deprected)
+Deprected! Please use ProtoRecordInputLayer or CSVRecordInputLayer.
+
+[LabelLayer](../api/classsinga_1_1LabelLayer.html) is a subclass of ParserLayer.
+It parses a single label from each Record. Consequently, it
+will put $b$ (mini-batch size) values into the Blob. It has no specific configuration fields.
+
+
+##### MnistImageLayer (Deprected)
+Deprected! Please use ProtoRecordInputLayer or CSVRecordInputLayer.
+[MnistImageLayer] is a subclass of ParserLayer. It parses the pixel values of
+each image from the MNIST dataset. The pixel
+values may be normalized as `x/norm_a - norm_b`. For example, if `norm_a` is
+set to 255 and `norm_b` is set to 0, then every pixel will be normalized into
+[0, 1].
+
+    type: kMnistImage
+    mnistimage_conf {
+      norm_a: float
+      norm_b: float
+    }
+
+##### RGBImageLayer (Deprected)
+Deprected! Please use the ImagePreprocessLayer.
+[RGBImageLayer](../api/classsinga_1_1RGBImageLayer.html) is a subclass of ParserLayer.
+It parses the RGB values of one image from each Record. It may also
+apply some transformations, e.g., cropping, mirroring operations. If the
+`meanfile` is specified, it should point to a path that contains one Record for
+the mean of each pixel over all training images.
+
+    type: kRGBImage
+    rgbimage_conf {
+      scale: float
+      cropsize: int  # cropping each image to keep the central part with this size
+      mirror: bool  # mirror the image by set image[i,j]=image[i,len-j]
+      meanfile: "Image_Mean_File_Path"
+    }
+
+##### PrefetchLayer
+
+[PrefetchLayer](../api/classsinga_1_1PrefetchLayer.html) embeds other input layers
+to do data prefeching.  It will launch a thread to call the embedded layers to load and extract features.
+It ensures that the I/O task and computation task can work simultaneously.
+One example PrefetchLayer configuration is,
+
+    layer {
+      name: "prefetch"
+      type: kPrefetch
+      sublayers {
+        name: "data"
+        type: kShardData
+        sharddata_conf { }
+      }
+      sublayers {
+        name: "rgb"
+        type: kRGBImage
+        srclayers:"data"
+        rgbimage_conf { }
+      }
+      sublayers {
+        name: "label"
+        type: kLabel
+        srclayers: "data"
+      }
+      exclude:kTest
+    }
+
+The layers on top of the PrefetchLayer should use the name of the embedded
+layers as their source layers. For example, the "rgb" and "label" should be
+configured to the `srclayers` of other layers.
+
+
+#### Output Layers
+
+Output layers get data from their source layers and write them to persistent storage,
+e.g., disk files or HDFS (to be supported).
+
+##### RecordOutputLayer
+
+This layer gets data (and label if it is available) from its source layer and converts it into records of type
+RecordProto. Records are written as (key = instance No., value = serialized record) tuples into Store, e.g., KVFile. The configuration of this layer
+should include the specifics of the Store backend via `store_conf`.
+
+    layer {
+      name: "output"
+      type: kRecordOutput
+      srclayers:
+      store_conf {
+        backend: "kvfile"
+        path:
+      }
+    }
+
+##### CSVOutputLayer
+This layer gets data (and label if it available) from its source layer and converts it into
+a string per instance with fields separated by commas (i.e., CSV format). The shape information
+is not kept in the string. All strings are written into
+Store, e.g., text file. The configuration of this layer should include the specifics of the Store backend via `store_conf`.
+
+    layer {
+      name: "output"
+      type: kCSVOutput
+      srclayers:
+      store_conf {
+        backend: "textfile"
+        path:
+      }
+    }
+
+#### Neuron Layers
+
+Neuron layers conduct feature transformations.
+
+##### ConvolutionLayer
+
+[ConvolutionLayer](../api/classsinga_1_1ConvolutionLayer.html) conducts convolution transformation.
+
+    type: kConvolution
+    convolution_conf {
+      num_filters: int
+      kernel: int
+      stride: int
+      pad: int
+    }
+    param { } # weight/filter matrix
+    param { } # bias vector
+
+The int value `num_filters` stands for the count of the applied filters; the int
+value `kernel` stands for the convolution kernel size (equal width and height);
+the int value `stride` stands for the distance between the successive filters;
+the int value `pad` pads each with a given int number of pixels border of
+zeros.
+
+##### InnerProductLayer
+
+[InnerProductLayer](../api/classsinga_1_1InnerProductLayer.html) is fully connected with its (single) source layer.
+Typically, it has two parameter fields, one for weight matrix, and the other
+for bias vector. It rotates the feature of the source layer (by multiplying with weight matrix) and
+shifts it (by adding the bias vector).
+
+    type: kInnerProduct
+    innerproduct_conf {
+      num_output: int
+    }
+    param { } # weight matrix
+    param { } # bias vector
+
+
+##### PoolingLayer
+
+[PoolingLayer](../api/classsinga_1_1PoolingLayer.html) is used to do a normalization (or averaging or sampling) of the
+feature vectors from the source layer.
+
+    type: kPooling
+    pooling_conf {
+      pool: AVE|MAX // Choose whether use the Average Pooling or Max Pooling
+      kernel: int   // size of the kernel filter
+      pad: int      // the padding size
+      stride: int   // the step length of the filter
+    }
+
+The pooling layer has two methods: Average Pooling and Max Pooling.
+Use the enum AVE and MAX to choose the method.
+
+  * Max Pooling selects the max value for each filtering area as a point of the
+  result feature blob.
+  * Average Pooling averages all values for each filtering area at a point of the
+    result feature blob.
+
+##### ReLULayer
+
+[ReLuLayer](../api/classsinga_1_1ReLULayer.html) has rectified linear neurons, which conducts the following
+transformation, `f(x) = Max(0, x)`. It has no specific configuration fields.
+
+##### STanhLayer
+
+[STanhLayer](../api/classsinga_1_1TanhLayer.html) uses the scaled tanh as activation function, i.e., `f(x)=1.7159047* tanh(0.6666667 * x)`.
+It has no specific configuration fields.
+
+##### SigmoidLayer
+
+[SigmoidLayer] uses the sigmoid (or logistic) as activation function, i.e.,
+`f(x)=sigmoid(x)`.  It has no specific configuration fields.
+
+
+##### Dropout Layer
+[DropoutLayer](../api/asssinga_1_1DropoutLayer.html) is a layer that randomly dropouts some inputs.
+This scheme helps deep learning model away from over-fitting.
+
+    type: kDropout
+    dropout_conf {
+      dropout_ratio: float # dropout probability
+    }
+
+##### LRNLayer
+[LRNLayer](../api/classsinga_1_1LRNLayer.html), (Local Response Normalization), normalizes over the channels.
+
+    type: kLRN
+    lrn_conf {
+      local_size: int
+      alpha: float  // scaling parameter
+      beta: float   // exponential number
+    }
+
+`local_size` specifies  the quantity of the adjoining channels which will be summed up.
+ For `WITHIN_CHANNEL`, it means the side length of the space region which will be summed up.
+
+
+#### Loss Layers
+
+Loss layers measures the objective training loss.
+
+##### SoftmaxLossLayer
+
+[SoftmaxLossLayer](../api/classsinga_1_1SoftmaxLossLayer.html) is a combination of the Softmax transformation and
+Cross-Entropy loss. It applies Softmax firstly to get a prediction probability
+for each output unit (neuron) and compute the cross-entropy against the ground truth.
+It is generally used as the final layer to generate labels for classification tasks.
+
+    type: kSoftmaxLoss
+    softmaxloss_conf {
+      topk: int
+    }
+
+The configuration field `topk` is for selecting the labels with `topk`
+probabilities as the prediction results. It is tedious for users to view the
+prediction probability of every label.
+
+#### ConnectionLayer
+
+Subclasses of ConnectionLayer are utility layers that connects other layers due
+to neural net partitioning or other cases.
+
+##### ConcateLayer
+
+[ConcateLayer](../api/classsinga_1_1ConcateLayer.html) connects more than one source layers to concatenate their feature
+blob along given dimension.
+
+    type: kConcate
+    concate_conf {
+      concate_dim: int  // define the dimension
+    }
+
+##### SliceLayer
+
+[SliceLayer](../api/classsinga_1_1SliceLayer.html) connects to more than one destination layers to slice its feature
+blob along given dimension.
+
+    type: kSlice
+    slice_conf {
+      slice_dim: int
+    }
+
+##### SplitLayer
+
+[SplitLayer](../api/classsinga_1_1SplitLayer.html) connects to more than one destination layers to replicate its
+feature blob.
+
+    type: kSplit
+    split_conf {
+      num_splits: int
+    }
+
+##### BridgeSrcLayer & BridgeDstLayer
+
+[BridgeSrcLayer](../api/classsinga_1_1BridgeSrcLayer.html) &
+[BridgeDstLayer](../api/classsinga_1_1BridgeDstLayer.html) are utility layers assisting data (e.g., feature or
+gradient) transferring due to neural net partitioning. These two layers are
+added implicitly. Users typically do not need to configure them in their neural
+net configuration.
+
+### OutputLayer
+
+It write the prediction results or the extracted features into file, HTTP stream
+or other places. Currently SINGA has not implemented any specific output layer.
+
+## Advanced user guide
+
+The base Layer class is introduced in this section, followed by how to
+implement a new Layer subclass.
+
+### Base Layer class
+
+#### Members
+
+    LayerProto layer_conf_;
+    Blob<float> data_, grad_;
+    vector<AuxType> aux_data_;
+
+The base layer class keeps the user configuration in `layer_conf_`.
+Almost all layers has $b$ (mini-batch size) feature vectors, which are stored
+in the `data_` [Blob](../api/classsinga_1_1Blob.html) (A Blob is a chunk of memory space, proposed in
+[Caffe](http://caffe.berkeleyvision.org/)).
+There are layers without feature vectors; instead, they share the data from
+source layers.
+The `grad_` Blob is for storing the gradients of the
+objective loss w.r.t. the `data_` Blob. It is necessary in [BP algorithm](../api/classsinga_1_1BPWorker.html),
+hence we put it as a member of the base class. For [CD algorithm](../api/classsinga_1_1CDWorker.html), the `grad_`
+field is not used; instead, the layers for the RBM model may have a Blob for the positive
+phase feature and a Blob for the negative phase feature. For a recurrent layer
+in RNN, one row of the feature blob corresponds to the feature of one internal layer.
+The `aux_data_` stores the auxiliary data, e.g., image label (set `AuxType` to int).
+If images have variant number of labels, the AuxType can be defined to `vector<int>`.
+Currently, we hard code `AuxType` to int. It will be added as a template argument of Layer class later.
+
+If a layer has parameters, these parameters are declared using type
+[Param](param.html). Since some layers do not have
+parameters, we do not declare any `Param` in the base layer class.
+
+#### Functions
+
+    virtual void Setup(const LayerProto& conf, const vector<Layer*>& srclayers);
+    virtual void ComputeFeature(int flag, const vector<Layer*>& srclayers) = 0;
+    virtual void ComputeGradient(int flag, const vector<Layer*>& srclayers) = 0;
+
+The `Setup` function reads user configuration, i.e. `conf`, and information
+from source layers, e.g., mini-batch size,  to set the
+shape of the `data_` (and `grad_`) field as well
+as some other layer specific fields.
+<!---
+If `npartitions` is larger than 1, then
+users need to reduce the sizes of `data_`, `grad_` Blobs or Param objects. For
+example, if the `partition_dim=0` and there is no source layer, e.g., this
+layer is a (bottom) data layer, then its `data_` and `grad_` Blob should have
+`b/npartitions` feature vectors; If the source layer is also partitioned on
+dimension 0, then this layer should have the same number of feature vectors as
+the source layer. More complex partition cases are discussed in
+[Neural net partitioning](neural-net.html#neural-net-partitioning). Typically, the
+Setup function just set the shapes of `data_` Blobs and Param objects.
+-->
+Memory will not be allocated until computation over the data structure happens.
+
+The `ComputeFeature` function evaluates the feature blob by transforming (e.g.
+convolution and pooling) features from the source layers.  `ComputeGradient`
+computes the gradients of parameters associated with this layer.  These two
+functions are invoked by the [TrainOneBatch](train-one-batch.html)
+function during training. Hence, they should be consistent with the
+`TrainOneBatch` function. Particularly, for feed-forward and RNN models, they are
+trained using [BP algorithm](train-one-batch.html#back-propagation),
+which requires each layer's `ComputeFeature`
+function to compute `data_` based on source layers, and requires each layer's
+`ComputeGradient` to compute gradients of parameters and source layers'
+`grad_`. For energy models, e.g., RBM, they are trained by
+[CD algorithm](train-one-batch.html#contrastive-divergence), which
+requires each layer's `ComputeFeature` function to compute the feature vectors
+for the positive phase or negative phase depending on the `phase` argument, and
+requires the `ComputeGradient` function to only compute parameter gradients.
+For some layers, e.g., loss layer or output layer, they can put the loss or
+prediction result into the `metric` argument, which will be averaged and
+displayed periodically.
+
+### Implementing a new Layer subclass
+
+Users can extend the Layer class or other subclasses to implement their own feature transformation
+logics as long as the two virtual functions are overridden to be consistent with
+the `TrainOneBatch` function. The `Setup` function may also be overridden to
+read specific layer configuration.
+
+The [RNNLM](rnn.html) provides a couple of user-defined layers. You can refer to them as examples.
+
+#### Layer specific protocol message
+
+To implement a new layer, the first step is to define the layer specific
+configuration. Suppose the new layer is `FooLayer`, the layer specific
+google protocol message `FooLayerProto` should be defined as
+
+    # in user.proto
+    package singa
+    import "job.proto"
+    message FooLayerProto {
+      optional int32 a = 1;  // specific fields to the FooLayer
+    }
+
+In addition, users need to extend the original `LayerProto` (defined in job.proto of SINGA)
+to include the `foo_conf` as follows.
+
+    extend LayerProto {
+      optional FooLayerProto foo_conf = 101;  // unique field id, reserved for extensions
+    }
+
+If there are multiple new layers, then each layer that has specific
+configurations would have a `<type>_conf` field and takes one unique extension number.
+SINGA has reserved enough extension numbers, e.g., starting from 101 to 1000.
+
+    # job.proto of SINGA
+    LayerProto {
+      ...
+      extensions 101 to 1000;
+    }
+
+With user.proto defined, users can use
+[protoc](https://developers.google.com/protocol-buffers/) to generate the `user.pb.cc`
+and `user.pb.h` files.  In users' code, the extension fields can be accessed via,
+
+    auto conf = layer_proto_.GetExtension(foo_conf);
+    int a = conf.a();
+
+When defining configurations of the new layer (in job.conf), users should use
+`user_type` for its layer type instead of `type`. In addition, `foo_conf`
+should be enclosed in brackets.
+
+    layer {
+      name: "foo"
+      user_type: "kFooLayer"  # Note user_type of user-defined layers is string
+      [foo_conf] {      # Note there is a pair of [] for extension fields
+        a: 10
+      }
+    }
+
+#### New Layer subclass declaration
+
+The new layer subclass can be implemented like the built-in layer subclasses.
+
+    class FooLayer : public singa::Layer {
+     public:
+      void Setup(const LayerProto& conf, const vector<Layer*>& srclayers) override;
+      void ComputeFeature(int flag, const vector<Layer*>& srclayers) override;
+      void ComputeGradient(int flag, const vector<Layer*>& srclayers) override;
+
+     private:
+      //  members
+    };
+
+Users must override the two virtual functions to be called by the
+`TrainOneBatch` for either BP or CD algorithm. Typically, the `Setup` function
+will also be overridden to initialize some members. The user configured fields
+can be accessed through `layer_conf_` as shown in the above paragraphs.
+
+#### New Layer subclass registration
+
+The newly defined layer should be registered in [main.cc](http://singa.incubator.apache.org/docs/programming-guide) by adding
+
+    driver.RegisterLayer<FooLayer, std::string>("kFooLayer"); // "kFooLayer" should be matched to layer configurations in job.conf.
+
+After that, the [NeuralNet](neural-net.html) can create instances of the new Layer subclass.

Added: incubator/singa/site/trunk/content/markdown/v0.2.0/jp/mesos.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.2.0/jp/mesos.md?rev=1738695&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.2.0/jp/mesos.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.2.0/jp/mesos.md Tue Apr 12 06:22:20 2016
@@ -0,0 +1,84 @@
+#Distributed Training on Mesos
+
+This guide explains how to start SINGA distributed training on a Mesos cluster. It assumes that both Mesos and HDFS are already running, and every node has SINGA installed.
+We assume the architecture depicted below, in which a cluster nodes are Docker container. Refer to [Docker guide](docker.html) for details of how to start individual nodes and set up network connection between them (make sure [weave](http://weave.works/guides/weave-docker-ubuntu-simple.html) is running at each node, and the cluster's headnode is running in container `node0`)
+
+![Nothing](http://www.comp.nus.edu.sg/~dinhtta/files/singa_mesos.png)
+
+---
+
+## Start HDFS and Mesos
+Go inside each container, using:
+````
+docker exec -it nodeX /bin/bash
+````
+and configure it as follows:
+
+* On container `node0`
+
+        hadoop namenode -format
+        hadoop-daemon.sh start namenode
+        /opt/mesos-0.22.0/build/bin/mesos-master.sh --work_dir=/opt --log_dir=/opt --quiet > /dev/null &
+        zk-service.sh start
+
+* On container `node1, node2, ...`
+
+        hadoop-daemon.sh start datanode
+        /opt/mesos-0.22.0/build/bin/mesos-slave.sh --master=node0:5050 --log_dir=/opt --quiet > /dev/null &
+
+To check if the setup has been successful, check that HDFS namenode has registered `N` datanodes, via:
+
+````
+hadoop dfsadmin -report
+````
+
+#### Mesos logs
+Mesos logs are stored at `/opt/lt-mesos-master.INFO` on `node0` and `/opt/lt-mesos-slave.INFO` at other nodes.
+
+---
+
+## Starting SINGA training on Mesos
+Assumed that Mesos and HDFS are already started, SINGA job can be launched at **any** container.
+
+#### Launching job
+
+1. Log in to any container, then
+        cd incubator-singa/tool/mesos
+<a name="job_start"></a>
+2. Check that configuration files are correct:
+    + `scheduler.conf` contains information about the master nodes
+    + `singa.conf` contains information about Zookeeper node0
+    + Job configuration file `job.conf` **contains full path to the examples directories (NO RELATIVE PATH!).**
+3. Start the job:
+    + If starting for the first time:
+
+	          ./scheduler <job config file> -scheduler_conf <scheduler config file> -singa_conf <SINGA config file>
+    + If not the first time:
+
+	          ./scheduler <job config file>
+
+**Notes.** Each running job is given a `frameworkID`. Look for the log message of the form:
+
+             Framework registered with XXX-XXX-XXX-XXX-XXX-XXX
+
+#### Monitoring and Debugging
+
+Each Mesos job is given a `frameworkID` and a *sandbox* directory is created for each job.
+The directory is in the specified `work_dir` (or `/tmp/mesos`) by default. For example, the error
+during SINGA execution can be found at:
+
+            /tmp/mesos/slaves/xxxxx-Sx/frameworks/xxxxx/executors/SINGA_x/runs/latest/stderr
+
+Other artifacts, like files downloaded from HDFS (`job.conf`) and `stdout` can be found in the same
+directory.
+
+#### Stopping
+
+There are two way to kill the running job:
+
+1. If the scheduler is running in the foreground, simply kill it (using `Ctrl-C`, for example).
+
+2. If the scheduler is running in the background, kill it using Mesos's REST API:
+
+          curl -d "frameworkId=XXX-XXX-XXX-XXX-XXX-XXX" -X POST http://<master>/master/shutdown
+

Added: incubator/singa/site/trunk/content/markdown/v0.2.0/jp/mlp.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.2.0/jp/mlp.md?rev=1738695&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.2.0/jp/mlp.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.2.0/jp/mlp.md Tue Apr 12 06:22:20 2016
@@ -0,0 +1,195 @@
+# MLP Example
+
+---
+
+Multilayer perceptron (MLP) is a subclass of feed-forward neural networks.
+A MLP typically consists of multiple directly connected layers, with each layer fully
+connected to the next one. In this example, we will use SINGA to train a
+[simple MLP model proposed by Ciresan](http://arxiv.org/abs/1003.0358)
+for classifying handwritten digits from the [MNIST dataset](http://yann.lecun.com/exdb/mnist/).
+
+## Running instructions
+
+Please refer to the [installation](installation.html) page for
+instructions on building SINGA, and the [quick start](quick-start.html)
+for instructions on starting zookeeper.
+
+We have provided scripts for preparing the training and test dataset in *examples/cifar10/*.
+
+    # in examples/mnist
+    $ cp Makefile.example Makefile
+    $ make download
+    $ make create
+
+After the datasets are prepared, we start the training by
+
+    ./bin/singa-run.sh -conf examples/mnist/job.conf
+
+After it is started, you should see output like
+
+    Record job information to /tmp/singa-log/job-info/job-1-20150817-055231
+    Executing : ./singa -conf /xxx/incubator-singa/examples/mnist/job.conf -singa_conf /xxx/incubator-singa/conf/singa.conf -singa_job 1
+    E0817 07:15:09.211885 34073 cluster.cc:51] proc #0 -> 192.168.5.128:49152 (pid = 34073)
+    E0817 07:15:14.972231 34114 server.cc:36] Server (group = 0, id = 0) start
+    E0817 07:15:14.972520 34115 worker.cc:134] Worker (group = 0, id = 0) start
+    E0817 07:15:24.462602 34073 trainer.cc:373] Test step-0, loss : 2.341021, accuracy : 0.109100
+    E0817 07:15:47.341076 34073 trainer.cc:373] Train step-0, loss : 2.357269, accuracy : 0.099000
+    E0817 07:16:07.173364 34073 trainer.cc:373] Train step-10, loss : 2.222740, accuracy : 0.201800
+    E0817 07:16:26.714855 34073 trainer.cc:373] Train step-20, loss : 2.091030, accuracy : 0.327200
+    E0817 07:16:46.590946 34073 trainer.cc:373] Train step-30, loss : 1.969412, accuracy : 0.442100
+    E0817 07:17:06.207080 34073 trainer.cc:373] Train step-40, loss : 1.865466, accuracy : 0.514800
+    E0817 07:17:25.890033 34073 trainer.cc:373] Train step-50, loss : 1.773849, accuracy : 0.569100
+    E0817 07:17:51.208935 34073 trainer.cc:373] Test step-60, loss : 1.613709, accuracy : 0.662100
+    E0817 07:17:53.176766 34073 trainer.cc:373] Train step-60, loss : 1.659150, accuracy : 0.652600
+    E0817 07:18:12.783370 34073 trainer.cc:373] Train step-70, loss : 1.574024, accuracy : 0.666000
+    E0817 07:18:32.904942 34073 trainer.cc:373] Train step-80, loss : 1.529380, accuracy : 0.670500
+    E0817 07:18:52.608111 34073 trainer.cc:373] Train step-90, loss : 1.443911, accuracy : 0.703500
+    E0817 07:19:12.168465 34073 trainer.cc:373] Train step-100, loss : 1.387759, accuracy : 0.721000
+    E0817 07:19:31.855865 34073 trainer.cc:373] Train step-110, loss : 1.335246, accuracy : 0.736500
+    E0817 07:19:57.327133 34073 trainer.cc:373] Test step-120, loss : 1.216652, accuracy : 0.769900
+
+After the training of some steps (depends on the setting) or the job is
+finished, SINGA will [checkpoint](checkpoint.html) the model parameters.
+
+## Details
+
+To train a model in SINGA, you need to prepare the datasets,
+and a job configuration which specifies the neural net structure, training
+algorithm (BP or CD), SGD update algorithm (e.g. Adagrad),
+number of training/test steps, etc.
+
+### Data preparation
+
+Before using SINGA, you need to write a program to pre-process the dataset you
+use to a format that SINGA can read. Please refer to the
+[Data Preparation](data.html) to get details about preparing
+this MNIST dataset.
+
+
+### Neural net
+
+<div style = "text-align: center">
+<img src = "../images/example-mlp.png" style = "width: 230px">
+<br/><strong>Figure 1 - Net structure of the MLP example. </strong></img>
+</div>
+
+
+Figure 1 shows the structure of the simple MLP model, which is constructed following
+[Ciresan's paper](http://arxiv.org/abs/1003.0358). The dashed circle contains
+two layers which represent one feature transformation stage. There are 6 such
+stages in total. They sizes of the [InnerProductLayer](layer.html#innerproductlayer)s in these circles decrease from
+2500->2000->1500->1000->500->10.
+
+Next we follow the guide in [neural net page](neural-net.html)
+and [layer page](layer.html) to write the neural net configuration.
+
+* We configure an input layer to read the training/testing records from a disk file.
+
+        layer {
+            name: "data"
+            type: kRecordInput
+            store_conf {
+              backend: "kvfile"
+              path: "examples/mnist/train_data.bin"
+              random_skip: 5000
+              batchsize: 64
+              shape: 784
+              std_value: 127.5
+              mean_value: 127.5
+             }
+             exclude: kTest
+          }
+
+        layer {
+            name: "data"
+            type: kRecordInput
+            store_conf {
+              backend: "kvfile"
+              path: "examples/mnist/test_data.bin"
+              batchsize: 100
+              shape: 784
+              std_value: 127.5
+              mean_value: 127.5
+             }
+             exclude: kTrain
+          }
+
+
+* All [InnerProductLayer](layer.html#innerproductlayer)s are configured similarly as,
+
+        layer{
+          name: "fc1"
+          type: kInnerProduct
+          srclayers:"data"
+          innerproduct_conf{
+            num_output: 2500
+          }
+          param{
+            name: "w1"
+            ...
+          }
+          param{
+            name: "b1"
+            ..
+          }
+        }
+
+    with the `num_output` decreasing from 2500 to 10.
+
+* A [STanhLayer](layer.html#stanhlayer) is connected to every InnerProductLayer
+except the last one. It transforms the feature via scaled tanh function.
+
+        layer{
+          name: "tanh1"
+          type: kSTanh
+          srclayers:"fc1"
+        }
+
+* The final [Softmax loss layer](layer.html#softmaxloss) connects
+to LabelLayer and the last STanhLayer.
+
+        layer{
+          name: "loss"
+          type:kSoftmaxLoss
+          softmaxloss_conf{ topk:1 }
+          srclayers:"fc6"
+          srclayers:"data"
+        }
+
+### Updater
+
+The [normal SGD updater](updater.html#updater) is selected.
+The learning rate shrinks by 0.997 every 60 steps (i.e., one epoch).
+
+    updater{
+      type: kSGD
+      learning_rate{
+        base_lr: 0.001
+        type : kStep
+        step_conf{
+          change_freq: 60
+          gamma: 0.997
+        }
+      }
+    }
+
+### TrainOneBatch algorithm
+
+The MLP model is a feed-forward model, hence
+[Back-propagation algorithm](train-one-batch#back-propagation)
+is selected.
+
+    train_one_batch {
+      alg: kBP
+    }
+
+### Cluster setting
+
+The following configuration set a single worker and server for training.
+[Training frameworks](frameworks.html) page introduces configurations of a couple of distributed
+training frameworks.
+
+    cluster {
+      nworker_groups: 1
+      nserver_groups: 1
+    }

Added: incubator/singa/site/trunk/content/markdown/v0.2.0/jp/model-config.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.2.0/jp/model-config.md?rev=1738695&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.2.0/jp/model-config.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.2.0/jp/model-config.md Tue Apr 12 06:22:20 2016
@@ -0,0 +1,294 @@
+# Model Configuration
+
+---
+
+SINGA uses the stochastic gradient descent (SGD) algorithm to train parameters
+of deep learning models.  For each SGD iteration, there is a
+[Worker](architecture.html) computing
+gradients of parameters from the NeuralNet and a [Updater]() updating parameter
+values based on gradients. Hence the model configuration mainly consists these
+three parts. We will introduce the NeuralNet, Worker and Updater in the
+following paragraphs and describe the configurations for them. All model
+configuration is specified in the model.conf file in the user provided
+workspace folder. E.g., the [cifar10 example folder](https://github.com/apache/incubator-singa/tree/master/examples/cifar10)
+has a model.conf file.
+
+
+## NeuralNet
+
+### Uniform model (neuralnet) representation
+
+<img src = "../images/model-categorization.png" style = "width: 400px"> Fig. 1:
+Deep learning model categorization</img>
+
+Many deep learning models have being proposed. Fig. 1 is a categorization of
+popular deep learning models based on the layer connections. The
+[NeuralNet](https://github.com/apache/incubator-singa/blob/master/include/neuralnet/neuralnet.h)
+abstraction of SINGA consists of multiple directly connected layers. This
+abstraction is able to represent models from all the three categorizations.
+
+  * For the feed-forward models, their connections are already directed.
+
+  * For the RNN models, we unroll them into directed connections, as shown in
+  Fig. 2.
+
+  * For the undirected connections in RBM, DBM, etc., we replace each undirected
+  connection with two directed connection, as shown in Fig. 3.
+
+<div style = "height: 200px">
+<div style = "float:left; text-align: center">
+<img src = "../images/unroll-rbm.png" style = "width: 280px"> <br/>Fig. 2: Unroll RBM </img>
+</div>
+<div style = "float:left; text-align: center; margin-left: 40px">
+<img src = "../images/unroll-rnn.png" style = "width: 550px"> <br/>Fig. 3: Unroll RNN </img>
+</div>
+</div>
+
+In specific, the NeuralNet class is defined in
+[neuralnet.h](https://github.com/apache/incubator-singa/blob/master/include/neuralnet/neuralnet.h) :
+
+    ...
+    vector<Layer*> layers_;
+    ...
+
+The Layer class is defined in
+[base_layer.h](https://github.com/apache/incubator-singa/blob/master/include/neuralnet/base_layer.h):
+
+    vector<Layer*> srclayers_, dstlayers_;
+    LayerProto layer_proto_;  // layer configuration, including meta info, e.g., name
+    ...
+
+
+The connection with other layers are kept in the `srclayers_` and `dstlayers_`.
+Since there are many different feature transformations, there are many
+different Layer implementations correspondingly. For layers that have
+parameters in their feature transformation functions, they would have Param
+instances in the layer class, e.g.,
+
+    Param weight;
+
+
+### Configure the structure of a NeuralNet instance
+
+To train a deep learning model, the first step is to write the configurations
+for the model structure, i.e., the layers and connections for the NeuralNet.
+Like [Caffe](http://caffe.berkeleyvision.org/), we use the [Google Protocol
+Buffer](https://developers.google.com/protocol-buffers/) to define the
+configuration protocol. The
+[NetProto](https://github.com/apache/incubator-singa/blob/master/src/proto/model.proto)
+specifies the configuration fields for a NeuralNet instance,
+
+message NetProto {
+  repeated LayerProto layer = 1;
+  ...
+}
+
+The configuration is then
+
+    layer {
+      // layer configuration
+    }
+    layer {
+      // layer configuration
+    }
+    ...
+
+To configure the model structure, we just configure each layer involved in the model.
+
+    message LayerProto {
+      // the layer name used for identification
+      required string name = 1;
+      // source layer names
+      repeated string srclayers = 3;
+      // parameters, e.g., weight matrix or bias vector
+      repeated ParamProto param = 12;
+      // the layer type from the enum above
+      required LayerType type = 20;
+      // configuration for convolution layer
+      optional ConvolutionProto convolution_conf = 30;
+      // configuration for concatenation layer
+      optional ConcateProto concate_conf = 31;
+      // configuration for dropout layer
+      optional DropoutProto dropout_conf = 33;
+      ...
+    }
+
+A sample configuration for a feed-forward model is like
+
+    layer {
+      name : "input"
+      type : kRecordInput
+    }
+    layer {
+      name : "conv"
+      type : kInnerProduct
+      srclayers : "input"
+      param {
+        // configuration for parameter
+      }
+      innerproduct_conf {
+        // configuration for this specific layer
+      }
+      ...
+    }
+
+The layer type list is defined in
+[LayerType](https://github.com/apache/incubator-singa/blob/master/src/proto/model.proto).
+One type (kFoo) corresponds to one child class of Layer (FooLayer) and one
+configuration field (foo_conf). All built-in layers are introduced in the [layer page](layer.html).
+
+## Worker
+
+At the beginning, the Work will initialize the values of Param instances of
+each layer either randomly (according to user configured distribution) or
+loading from a [checkpoint file]().  For each training iteration, the worker
+visits layers of the neural network to compute gradients of Param instances of
+each layer. Corresponding to the three categories of models, there are three
+different algorithm to compute the gradients of a neural network.
+
+  1. Back-propagation (BP) for feed-forward models
+  2. Back-propagation through time (BPTT) for recurrent neural networks
+  3. Contrastive divergence (CD) for RBM, DBM, etc models.
+
+SINGA has provided these three algorithms as three Worker implementations.
+Users only need to configure in the model.conf file to specify which algorithm
+should be used. The configuration protocol is
+
+    message ModelProto {
+      ...
+      enum GradCalcAlg {
+      // BP algorithm for feed-forward models, e.g., CNN, MLP, RNN
+      kBP = 1;
+      // BPTT for recurrent neural networks
+      kBPTT = 2;
+      // CD algorithm for RBM, DBM etc., models
+      kCd = 3;
+      }
+      // gradient calculation algorithm
+      required GradCalcAlg alg = 8 [default = kBackPropagation];
+      ...
+    }
+
+These algorithms override the TrainOneBatch function of the Worker. E.g., the
+BPWorker implements it as
+
+    void BPWorker::TrainOneBatch(int step, Metric* perf) {
+      Forward(step, kTrain, train_net_, perf);
+      Backward(step, train_net_);
+    }
+
+The Forward function passes the raw input features of one mini-batch through
+all layers, and the Backward function visits the layers in reverse order to
+compute the gradients of the loss w.r.t each layer's feature and each layer's
+Param objects. Different algorithms would visit the layers in different orders.
+Some may traverses the neural network multiple times, e.g., the CDWorker's
+TrainOneBatch function is:
+
+    void CDWorker::TrainOneBatch(int step, Metric* perf) {
+      PostivePhase(step, kTrain, train_net_, perf);
+      NegativePhase(step, kTran, train_net_, perf);
+      GradientPhase(step, train_net_);
+    }
+
+Each `*Phase` function would visit all layers one or multiple times.
+All algorithms will finally call two functions of the Layer class:
+
+     /**
+      * Transform features from connected layers into features of this layer.
+      *
+      * @param phase kTrain, kTest, kPositive, etc.
+      */
+     virtual void ComputeFeature(Phase phase, Metric* perf) = 0;
+     /**
+      * Compute gradients for parameters (and connected layers).
+      *
+      * @param phase kTrain, kTest, kPositive, etc.
+      */
+     virtual void ComputeGradient(Phase phase) = 0;
+
+All [Layer implementations]() must implement the above two functions.
+
+
+## Updater
+
+Once the gradients of parameters are computed, the Updater will update
+parameter values.  There are many SGD variants for updating parameters, like
+[AdaDelta](http://arxiv.org/pdf/1212.5701v1.pdf),
+[AdaGrad](http://www.magicbroom.info/Papers/DuchiHaSi10.pdf),
+[RMSProp](http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf),
+[Nesterov](http://scholar.google.com/citations?view_op=view_citation&amp;hl=en&amp;user=DJ8Ep8YAAAAJ&amp;citation_for_view=DJ8Ep8YAAAAJ:hkOj_22Ku90C)
+and SGD with momentum. The core functions of the Updater is
+
+    /**
+     * Update parameter values based on gradients
+     * @param step training step
+     * @param param pointer to the Param object
+     * @param grad_scale scaling factor for the gradients
+     */
+    void Update(int step, Param* param, float grad_scale=1.0f);
+    /**
+     * @param step training step
+     * @return the learning rate for this step
+     */
+    float GetLearningRate(int step);
+
+SINGA provides several built-in updaters and learning rate change methods.
+Users can configure them according to the UpdaterProto
+
+    message UpdaterProto {
+      enum UpdaterType{
+        // noraml SGD with momentum and weight decay
+        kSGD = 1;
+        // adaptive subgradient, http://www.magicbroom.info/Papers/DuchiHaSi10.pdf
+        kAdaGrad = 2;
+        // http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf
+        kRMSProp = 3;
+        // Nesterov first optimal gradient method
+        kNesterov = 4;
+      }
+      // updater type
+      required UpdaterType type = 1 [default=kSGD];
+      // configuration for RMSProp algorithm
+      optional RMSPropProto rmsprop_conf = 50;
+
+      enum ChangeMethod {
+        kFixed = 0;
+        kInverseT = 1;
+        kInverse = 2;
+        kExponential = 3;
+        kLinear = 4;
+        kStep = 5;
+        kFixedStep = 6;
+      }
+      // change method for learning rate
+      required ChangeMethod lr_change= 2 [default = kFixed];
+
+      optional FixedStepProto fixedstep_conf=40;
+      ...
+      optional float momentum = 31 [default = 0];
+      optional float weight_decay = 32 [default = 0];
+      // base learning rate
+      optional float base_lr = 34 [default = 0];
+    }
+
+
+## Other model configuration fields
+
+Some other important configuration fields for training a deep learning model is
+listed:
+
+    // model name, e.g., "cifar10-dcnn", "mnist-mlp"
+    string name;
+    // displaying training info for every this number of iterations, default is 0
+    int32 display_freq;
+    // total num of steps/iterations for training
+    int32 train_steps;
+    // do test for every this number of training iterations, default is 0
+    int32 test_freq;
+    // run test for this number of steps/iterations, default is 0.
+    // The test dataset has test_steps * batchsize instances.
+    int32 test_steps;
+    // do checkpoint for every this number of training steps, default is 0
+    int32 checkpoint_freq;
+
+The pages of [checkpoint and restore](checkpoint.html) has details on checkpoint related fields.

Added: incubator/singa/site/trunk/content/markdown/v0.2.0/jp/neural-net.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.2.0/jp/neural-net.md?rev=1738695&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.2.0/jp/neural-net.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.2.0/jp/neural-net.md Tue Apr 12 06:22:20 2016
@@ -0,0 +1,327 @@
+# Neural Net
+
+---
+
+`NeuralNet` in SINGA represents an instance of user's neural net model. As the
+neural net typically consists of a set of layers, `NeuralNet` comprises
+a set of unidirectionally connected [Layer](layer.html)s.
+This page describes how to convert an user's neural net into
+the configuration of `NeuralNet`.
+
+<img src="../images/model-category.png" align="center" width="200px"/>
+<span><strong>Figure 1 - Categorization of popular deep learning models.</strong></span>
+
+## Net structure configuration
+
+Users configure the `NeuralNet` by listing all layers of the neural net and
+specifying each layer's source layer names. Popular deep learning models can be
+categorized as Figure 1. The subsequent sections give details for each
+category.
+
+### Feed-forward models
+
+<div align = "left">
+<img src="../images/mlp-net.png" align="center" width="200px"/>
+<span><strong>Figure 2 - Net structure of a MLP model.</strong></span>
+</div>
+
+Feed-forward models, e.g., CNN and MLP, can easily get configured as their layer
+connections are undirected without circles. The
+configuration for the MLP model shown in Figure 1 is as follows,
+
+    net {
+      layer {
+        name : 'data"
+        type : kData
+      }
+      layer {
+        name : 'image"
+        type : kImage
+        srclayer: 'data'
+      }
+      layer {
+        name : 'label"
+        type : kLabel
+        srclayer: 'data'
+      }
+      layer {
+        name : 'hidden"
+        type : kHidden
+        srclayer: 'image'
+      }
+      layer {
+        name : 'softmax"
+        type : kSoftmaxLoss
+        srclayer: 'hidden'
+        srclayer: 'label'
+      }
+    }
+
+### Energy models
+
+<img src="../images/rbm-rnn.png" align="center" width="500px"/>
+<span><strong>Figure 3 - Convert connections in RBM and RNN.</strong></span>
+
+
+For energy models including RBM, DBM,
+etc., their connections are undirected (i.e., Category B). To represent these models using
+`NeuralNet`, users can simply replace each connection with two directed
+connections, as shown in Figure 3a. In other words, for each pair of connected layers, their source
+layer field should include each other's name.
+The full [RBM example](rbm.html) has
+detailed neural net configuration for a RBM model, which looks like
+
+    net {
+      layer {
+        name : "vis"
+        type : kVisLayer
+        param {
+          name : "w1"
+        }
+        srclayer: "hid"
+      }
+      layer {
+        name : "hid"
+        type : kHidLayer
+        param {
+          name : "w2"
+          share_from: "w1"
+        }
+        srclayer: "vis"
+      }
+    }
+
+### RNN models
+
+For recurrent neural networks (RNN), users can remove the recurrent connections
+by unrolling the recurrent layer.  For example, in Figure 3b, the original
+layer is unrolled into a new layer with 4 internal layers. In this way, the
+model is like a normal feed-forward model, thus can be configured similarly.
+The [RNN example](rnn.html) has a full neural net
+configuration for a RNN model.
+
+
+## Configuration for multiple nets
+
+Typically, a training job includes three neural nets for
+training, validation and test phase respectively. The three neural nets share most
+layers except the data layer, loss layer or output layer, etc..  To avoid
+redundant configurations for the shared layers, users can uses the `exclude`
+filed to filter a layer in the neural net, e.g., the following layer will be
+filtered when creating the testing `NeuralNet`.
+
+
+    layer {
+      ...
+      exclude : kTest # filter this layer for creating test net
+    }
+
+
+
+## Neural net partitioning
+
+A neural net can be partitioned in different ways to distribute the training
+over multiple workers.
+
+### Batch and feature dimension
+
+<img src="../images/partition_fc.png" align="center" width="400px"/>
+<span><strong>Figure 4 - Partitioning of a fully connected layer.</strong></span>
+
+
+Every layer's feature blob is considered a matrix whose rows are feature
+vectors. Thus, one layer can be split on two dimensions. Partitioning on
+dimension 0 (also called batch dimension) slices the feature matrix by rows.
+For instance, if the mini-batch size is 256 and the layer is partitioned into 2
+sub-layers, each sub-layer would have 128 feature vectors in its feature blob.
+Partitioning on this dimension has no effect on the parameters, as every
+[Param](param.html) object is replicated in the sub-layers. Partitioning on dimension
+1 (also called feature dimension) slices the feature matrix by columns. For
+example, suppose the original feature vector has 50 units, after partitioning
+into 2 sub-layers, each sub-layer would have 25 units. This partitioning may
+result in [Param](param.html) object being split, as shown in
+Figure 4. Both the bias vector and weight matrix are
+partitioned into two sub-layers.
+
+
+### Partitioning configuration
+
+There are 4 partitioning schemes, whose configurations are give below,
+
+  1. Partitioning each singe layer into sub-layers on batch dimension (see
+  below). It is enabled by configuring the partition dimension of the layer to
+  0, e.g.,
+
+          # with other fields omitted
+          layer {
+            partition_dim: 0
+          }
+
+  2. Partitioning each singe layer into sub-layers on feature dimension (see
+  below).  It is enabled by configuring the partition dimension of the layer to
+  1, e.g.,
+
+          # with other fields omitted
+          layer {
+            partition_dim: 1
+          }
+
+  3. Partitioning all layers into different subsets. It is enabled by
+  configuring the location ID of a layer, e.g.,
+
+          # with other fields omitted
+          layer {
+            location: 1
+          }
+          layer {
+            location: 0
+          }
+
+
+  4. Hybrid partitioning of strategy 1, 2 and 3. The hybrid partitioning is
+  useful for large models. An example application is to implement the
+  [idea proposed by Alex](http://arxiv.org/abs/1404.5997).
+  Hybrid partitioning is configured like,
+
+          # with other fields omitted
+          layer {
+            location: 1
+          }
+          layer {
+            location: 0
+          }
+          layer {
+            partition_dim: 0
+            location: 0
+          }
+          layer {
+            partition_dim: 1
+            location: 0
+          }
+
+Currently SINGA supports strategy-2 well. Other partitioning strategies are
+are under test and will be released in later version.
+
+## Parameter sharing
+
+Parameters can be shared in two cases,
+
+  * sharing parameters among layers via user configuration. For example, the
+  visible layer and hidden layer of a RBM shares the weight matrix, which is configured through
+  the `share_from` field as shown in the above RBM configuration. The
+  configurations must be the same (except name) for shared parameters.
+
+  * due to neural net partitioning, some `Param` objects are replicated into
+  different workers, e.g., partitioning one layer on batch dimension. These
+  workers share parameter values. SINGA controls this kind of parameter
+  sharing automatically, users do not need to do any configuration.
+
+  * the `NeuralNet` for training and testing (and validation) share most layers
+  , thus share `Param` values.
+
+If the shared `Param` instances resident in the same process (may in different
+threads), they use the same chunk of memory space for their values. But they
+would have different memory spaces for their gradients. In fact, their
+gradients will be averaged by the stub or server.
+
+## Advanced user guide
+
+### Creation
+
+    static NeuralNet* NeuralNet::Create(const NetProto& np, Phase phase, int num);
+
+The above function creates a `NeuralNet` for a given phase, and returns a
+pointer to the `NeuralNet` instance. The phase is in {kTrain,
+kValidation, kTest}. `num` is used for net partitioning which indicates the
+number of partitions.  Typically, a training job includes three neural nets for
+training, validation and test phase respectively. The three neural nets share most
+layers except the data layer, loss layer or output layer, etc.. The `Create`
+function takes in the full net configuration including layers for training,
+validation and test.  It removes layers for phases other than the specified
+phase based on the `exclude` field in
+[layer configuration](layer.html):
+
+    layer {
+      ...
+      exclude : kTest # filter this layer for creating test net
+    }
+
+The filtered net configuration is passed to the constructor of `NeuralNet`:
+
+    NeuralNet::NeuralNet(NetProto netproto, int npartitions);
+
+The constructor creates a graph representing the net structure firstly in
+
+    Graph* NeuralNet::CreateGraph(const NetProto& netproto, int npartitions);
+
+Next, it creates a layer for each node and connects layers if their nodes are
+connected.
+
+    void NeuralNet::CreateNetFromGraph(Graph* graph, int npartitions);
+
+Since the `NeuralNet` instance may be shared among multiple workers, the
+`Create` function returns a pointer to the `NeuralNet` instance .
+
+### Parameter sharing
+
+ `Param` sharing
+is enabled by first sharing the Param configuration (in `NeuralNet::Create`)
+to create two similar (e.g., the same shape) Param objects, and then calling
+(in `NeuralNet::CreateNetFromGraph`),
+
+    void Param::ShareFrom(const Param& from);
+
+It is also possible to share `Param`s of two nets, e.g., sharing parameters of
+the training net and the test net,
+
+    void NeuralNet:ShareParamsFrom(NeuralNet* other);
+
+It will call `Param::ShareFrom` for each Param object.
+
+### Access functions
+`NeuralNet` provides a couple of access function to get the layers and params
+of the net:
+
+    const std::vector<Layer*>& layers() const;
+    const std::vector<Param*>& params() const ;
+    Layer* name2layer(string name) const;
+    Param* paramid2param(int id) const;
+
+
+### Partitioning
+
+
+#### Implementation
+
+SINGA partitions the neural net in `CreateGraph` function, which creates one
+node for each (partitioned) layer. For example, if one layer's partition
+dimension is 0 or 1, then it creates `npartition` nodes for it; if the
+partition dimension is -1, a single node is created, i.e., no partitioning.
+Each node is assigned a partition (or location) ID. If the original layer is
+configured with a location ID, then the ID is assigned to each newly created node.
+These nodes are connected according to the connections of the original layers.
+Some connection layers will be added automatically.
+For instance, if two connected sub-layers are located at two
+different workers, then a pair of bridge layers is inserted to transfer the
+feature (and gradient) blob between them. When two layers are partitioned on
+different dimensions, a concatenation layer which concatenates feature rows (or
+columns) and a slice layer which slices feature rows (or columns) would be
+inserted. These connection layers help making the network communication and
+synchronization transparent to the users.
+
+#### Dispatching partitions to workers
+
+Each (partitioned) layer is assigned a location ID, based on which it is dispatched to one
+worker. Particularly, the pointer to the `NeuralNet` instance is passed
+to every worker within the same group, but each worker only computes over the
+layers that have the same partition (or location) ID as the worker's ID.  When
+every worker computes the gradients of the entire model parameters
+(strategy-2), we refer to this process as data parallelism.  When different
+workers compute the gradients of different parameters (strategy-3 or
+strategy-1), we call this process model parallelism.  The hybrid partitioning
+leads to hybrid parallelism where some workers compute the gradients of the
+same subset of model parameters while other workers compute on different model
+parameters.  For example, to implement the hybrid parallelism in for the
+[DCNN model](http://arxiv.org/abs/1404.5997), we set `partition_dim = 0` for
+lower layers and `partition_dim = 1` for higher layers.
+

Added: incubator/singa/site/trunk/content/markdown/v0.2.0/jp/neuralnet-partition.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.2.0/jp/neuralnet-partition.md?rev=1738695&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.2.0/jp/neuralnet-partition.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.2.0/jp/neuralnet-partition.md Tue Apr 12 06:22:20 2016
@@ -0,0 +1,54 @@
+# Neural Net Partition
+
+---
+
+The purposes of partitioning neural network is to distribute the partitions onto
+different working units (e.g., threads or nodes, called workers in this article)
+and parallelize the processing.
+Another reason for partition is to handle large neural network which cannot be
+hold in a single node. For instance, to train models against images with high
+resolution we need large neural networks (in terms of training parameters).
+
+Since *Layer* is the first class citizen in SIGNA, we do the partition against
+layers. Specifically, we support partitions at two levels. First, users can configure
+the location (i.e., worker ID) of each layer. In this way, users assign one worker
+for each layer. Secondly, for one layer, we can partition its neurons or partition
+the instances (e.g, images). They are called layer partition and data partition
+respectively. We illustrate the two types of partitions using an simple convolutional neural network.
+
+<img src="../images/conv-mnist.png" style="width: 220px"/>
+
+The above figure shows a convolutional neural network without any partition. It
+has 8 layers in total (one rectangular represents one layer). The first layer is
+DataLayer (data) which reads data from local disk files/databases (or HDFS). The second layer
+is a MnistLayer which parses the records from MNIST data to get the pixels of a batch
+of 8 images (each image is of size 28x28). The LabelLayer (label) parses the records to get the label
+of each image in the batch. The ConvolutionalLayer (conv1) transforms the input image to the
+shape of 8x27x27. The ReLULayer (relu1) conducts elementwise transformations. The PoolingLayer (pool1)
+sub-samples the images. The fc1 layer is fully connected with pool1 layer. It
+mulitplies each image with a weight matrix to generate a 10 dimension hidden feature which
+is then normalized by a SoftmaxLossLayer to get the prediction.
+
+<img src="../images/conv-mnist-datap.png" style="width: 1000px"/>
+
+The above figure shows the convolutional neural network after partitioning all layers
+except the DataLayer and ParserLayers, into 3 partitions using data partition.
+The read layers process 4 images of the batch, the black and blue layers process 2 images
+respectively. Some helper layers, i.e., SliceLayer, ConcateLayer, BridgeSrcLayer,
+BridgeDstLayer and SplitLayer, are added automatically by our partition algorithm.
+Layers of the same color resident in the same worker. There would be data transferring
+across different workers at the boundary layers (i.e., BridgeSrcLayer and BridgeDstLayer),
+e.g., between s-slice-mnist-conv1 and d-slice-mnist-conv1.
+
+<img src="../images/conv-mnist-layerp.png" style="width: 1000px"/>
+
+The above figure shows the convolutional neural network after partitioning all layers
+except the DataLayer and ParserLayers, into 2 partitions using layer partition. We can
+see that each layer processes all 8 images from the batch. But different partitions process
+different part of one image. For instance, the layer conv1-00 process only 4 channels. The other
+4 channels are processed by conv1-01 which residents in another worker.
+
+
+Since the partition is done at the layer level, we can apply different partitions for
+different layers to get a hybrid partition for the whole neural network. Moreover,
+we can also specify the layer locations to locate different layers to different workers.



Mime
View raw message