Subject: svn commit: r985457 [9/35] - in /websites/staging/singa/trunk/content: ./ community/ develop/ docs/ docs/jp/ docs/kr/ docs/zh/ releases/ v0.1.0/ v0.2.0/ v0.2.0/jp/ v0.2.0/kr/ v0.2.0/zh/
Date: Tue, 12 Apr 2016 06:24:54 -0000
From: buildbot@apache.org
To: commits@singa.incubator.apache.org

Added: websites/staging/singa/trunk/content/v0.2.0/gpu.html
==============================================================================

Training on GPU

Since GPUs are much faster than CPUs at linear algebra operations, it is essential to support training deep learning models (which involve many linear algebra operations) on GPU cards. SINGA now supports training on a single node (i.e., process) with multiple GPU cards. Training over a GPU cluster with multiple nodes is under development.

Instructions

Compilation

To enable training on GPU, you need to compile SINGA with CUDA from Nvidia:

    ./configure --enable-cuda --with-cuda=<path to cuda folder>

In addition, if you want to use the CUDNN library for convolutional neural networks, provided by Nvidia, you need to enable CUDNN:

    ./configure --enable-cuda --with-cuda=<path to cuda folder> --enable-cudnn --with-cudnn=<path to cudnn folder>

SINGA now supports CUDNN V3.0.

Configuration

The job configuration for GPU training is similar to that for training on CPU. There is one additional field to configure, gpu, which indicates the device IDs of the GPUs you want to use. The simplest configuration is

    # job.conf
    ...
    gpu: 0
    ...

This configuration will run the worker on GPU 0. If you want to launch multiple workers, each on a separate GPU, you can configure it as

    # job.conf
    ...
    gpu: 0
    gpu: 2
    ...
    cluster {
      nworkers_per_group: 2
      nworkers_per_process: 2
    }

Using the above configuration, SINGA partitions each mini-batch evenly onto the two workers, which run on GPU 0 and GPU 2 respectively. For more information on running multiple workers on a single node, please refer to Training Framework. Be careful to configure the same number of workers and GPUs; otherwise some workers would run on GPU and the rest on CPU. This kind of hybrid training is not well supported for now.

For some layers, such as InnerProductLayer, GRULayer and ReLULayer, the implementation is transparent to GPU/CPU, so you can use the same configuration for these layers whether they run on GPU or CPU. For other layers, especially the layers used in ConvNets, SINGA uses different implementations for GPU and CPU. In particular, the GPU version is implemented using the CUDNN library. To train a ConvNet on GPU, you configure the layers as

    layer {
      type: kCudnnConv
      ...
    }
    layer {
      type: kCudnnPool
      ...
    }

The cifar10 example and the AlexNet example have complete configurations for ConvNets.

Implementation details

SINGA implements GPU training by assigning each worker a GPU device at the beginning of training (via the Driver class). The worker can then call GPU functions and run them on the assigned GPU. GPUs are typically used for the linear algebra computation in layer functions, because GPUs excel at such computation. There is a Context singleton, which stores the handles and random generators for each device. The layer code should detect its running device and then call the CPU or GPU functions correspondingly.
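
As a rough illustration of this pattern, here is a minimal, self-contained C++ sketch. The class name Context matches the description above, but the method names (SetupDevice, device_id) and the thread-to-device registry are illustrative assumptions, not SINGA's actual API.

    #include <cstdio>
    #include <mutex>
    #include <thread>
    #include <unordered_map>

    // Illustrative sketch of the Context-singleton pattern described above;
    // the names and signatures are assumptions, not SINGA's actual API.
    class Context {
     public:
      static Context& Instance() {
        static Context ctx;  // one shared instance per process
        return ctx;
      }
      // Called once per worker at startup (the Driver's job); device < 0 means CPU.
      void SetupDevice(std::thread::id tid, int device) {
        std::lock_guard<std::mutex> lock(mu_);
        device_of_[tid] = device;
      }
      int device_id(std::thread::id tid) {
        std::lock_guard<std::mutex> lock(mu_);
        auto it = device_of_.find(tid);
        return it == device_of_.end() ? -1 : it->second;  // default to CPU
      }
     private:
      std::mutex mu_;
      std::unordered_map<std::thread::id, int> device_of_;
    };

    // A layer function detects its assigned device and dispatches accordingly.
    void ComputeFeature() {
      int dev = Context::Instance().device_id(std::this_thread::get_id());
      if (dev < 0)
        std::printf("CPU path: call CPU (e.g., BLAS) kernels\n");
      else
        std::printf("GPU path: switch to device %d, call CUDA/CUDNN kernels\n", dev);
    }

    int main() {
      Context::Instance().SetupDevice(std::this_thread::get_id(), 0);
      ComputeFeature();  // takes the GPU path for device 0
    }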


To make layer implementation easier, SINGA provides some linear algebra functions (in math_blob.h), which are transparent to the running device for users. Internally, they query the Context singleton to get the device information and call the CPU or GPU to do the computation. Consequently, users can implement layers without awareness of the underlying running device.
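
The sketch below shows this idea in miniature; the names Add, AddCPU, AddGPU and the OnGPU query are illustrative stand-ins, not the actual math_blob.h interface.

    #include <cstddef>
    #include <cstdio>
    #include <vector>

    // Stand-in for querying the Context singleton for the current device.
    bool OnGPU() { return false; }  // pretend this worker was assigned a CPU

    void AddCPU(const float* a, const float* b, float* c, std::size_t n) {
      for (std::size_t i = 0; i < n; ++i) c[i] = a[i] + b[i];
    }
    void AddGPU(const float* a, const float* b, float* c, std::size_t n) {
      std::printf("would launch a CUDA kernel for %zu elements here\n", n);
    }

    // The user-facing function is transparent to the running device.
    void Add(const std::vector<float>& a, const std::vector<float>& b,
             std::vector<float>* c) {
      if (OnGPU())
        AddGPU(a.data(), b.data(), c->data(), a.size());
      else
        AddCPU(a.data(), b.data(), c->data(), a.size());
    }

    int main() {
      std::vector<float> a{1, 2}, b{3, 4}, c(2);
      Add(a, b, &c);  // layer code never mentions the device
      std::printf("%.0f %.0f\n", c[0], c[1]);  // prints: 4 6
    }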


If the desired functionality cannot be implemented with the functions SINGA provides in math_blob.h, the layer code needs to handle the CPU and GPU devices explicitly, by querying the Context singleton. For layers that cannot run on GPU, e.g., input/output layers and connection layers, which have little computation but heavy I/O or network workload, there is no need to consider the GPU device: when these layers are configured in a neural net, they will run on CPU (since they do not call GPU functions).

Copyright © 2015 The Apache Software Foundation. All rights reserved. Apache Singa, Apache, the Apache feather logo, and the Apache Singa project logos are trademarks of The Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their respective owners.

Added: websites/staging/singa/trunk/content/v0.2.0/hdfs.html
==============================================================================

Using HDFS with SINGA

This guide explains how to use HDFS as the data store for SINGA jobs.

  1. Quick start using Docker
  2. Setup HDFS
  3. Examples

Quick start using Docker

We provide a Docker container built on top of singa/mesos (see the guide on building SINGA on Docker).

    git clone https://github.com/ug93tad/incubator-singa
    cd incubator-singa
    git checkout SINGA-97-docker
    cd tool/docker/hdfs
    sudo docker build -t singa/hdfs .

Once built, the container image singa/hdfs contains the HDFS C++ client library (libhdfs3) and the latest SINGA code. Many distributed nodes can be launched, and HDFS set up, by following the guide for running distributed SINGA on Mesos.

In the following, we assume an HDFS setup with node0 as the namenode and nodei (i > 0) as the datanodes.

Setup HDFS

There are at least two C/C++ client libraries for interacting with HDFS. One is libhdfs, from Hadoop, which is a JNI-based library, meaning that communication goes through a JVM. The other is libhdfs3, a native C++ library developed by Pivotal, in which the client communicates directly with HDFS via RPC. The current implementation uses the latter.
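
For illustration, reading a file through the libhdfs-compatible C API that libhdfs3 exposes looks roughly like the sketch below. The header path and the node0:9000 namenode address follow the setup assumed in this guide; check the installed hdfs.h for the exact signatures.

    #include <fcntl.h>      // O_RDONLY
    #include <cstdio>
    #include <hdfs/hdfs.h>  // libhdfs3's header (Hadoop's libhdfs ships plain hdfs.h)

    int main() {
      // Connect to the namenode assumed in this guide (node0, port 9000).
      hdfsFS fs = hdfsConnect("node0", 9000);
      if (fs == nullptr) { std::fprintf(stderr, "connect failed\n"); return 1; }

      // Open for reading with default buffer size, replication and block size.
      hdfsFile in = hdfsOpenFile(fs, "/examples/cifar10/train_data.bin",
                                 O_RDONLY, 0, 0, 0);
      if (in == nullptr) { std::fprintf(stderr, "open failed\n"); hdfsDisconnect(fs); return 1; }

      char buf[4096];
      tSize n = hdfsRead(fs, in, buf, sizeof(buf));  // read up to 4 KB
      std::printf("read %d bytes\n", static_cast<int>(n));

      hdfsCloseFile(fs, in);
      hdfsDisconnect(fs);
      return 0;
    }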

  1. Install libhdfs3: follow the official guide.

  2. Additional setup: recent versions of Hadoop (>2.4.x) support short-circuit local reads, which bypass network communication (TCP sockets) when retrieving data on a local node. libhdfs3 throws errors (but continues to work) when it finds that short-circuit read is not set. To silence these complaints, and improve performance, add the following configuration to hdfs-site.xml and to hdfs-client.xml:

     <property>
       <name>dfs.client.read.shortcircuit</name>
       <value>true</value>
     </property>
     <property>
       <name>dfs.domain.socket.path</name>
       <value>/var/lib/hadoop-hdfs/dn_socket</value>
     </property>

     Next, at each client, set the LIBHDFS3_CONF variable to point to the hdfs-client.xml file:

     export LIBHDFS3_CONF=$HADOOP_HOME/etc/hadoop/hdfs-client.xml

Examples

We explain how to run the CIFAR10 and MNIST examples. Before training, the data must be uploaded to HDFS.

CIFAR10

  1. Upload the data to HDFS (done at any of the HDFS nodes):

     • Change job.conf to use HDFS: in examples/cifar10/job.conf, set the backend property to hdfsfile.
     • Create and upload the data:

         cd examples/cifar10
         cp Makefile.example Makefile
         make create
         hadoop dfs -mkdir /examples/cifar10
         hadoop dfs -copyFromLocal cifar-10-batches-bin /examples/cifar10/


     If successful, the files can be seen in HDFS via hadoop dfs -ls /examples/cifar10.

  2. Training:

     • Make sure conf/singa.conf has the correct path to the Zookeeper service:

         zookeeper_host: "node0:2181"

     • Make sure job.conf has the correct paths to the train and test datasets:

         // train layer
         path: "hdfs://node0:9000/examples/cifar10/train_data.bin"
         mean_file: "hdfs://node0:9000/examples/cifar10/image_mean.bin"
         // test layer
         path: "hdfs://node0:9000/examples/cifar10/test_data.bin"
         mean_file: "hdfs://node0:9000/examples/cifar10/image_mean.bin"

     • Start training: execute the following command at every node:

         ./singa -conf examples/cifar10/job.conf -singa_conf singa.conf -singa_job 0


MNIST

  1. Upload the data to HDFS (done at any of the HDFS nodes):

     • Change job.conf to use HDFS: in examples/mnist/job.conf, set the backend property to hdfsfile.
     • Create and upload the data:

         cd examples/mnist
         cp Makefile.example Makefile
         make create
         make compile
         ./create_data.bin train-images-idx3-ubyte train-labels-idx1-ubyte hdfs://node0:9000/examples/mnist/train_data.bin
         ./create_data.bin t10k-images-idx3-ubyte t10k-labels-idx1-ubyte hdfs://node0:9000/examples/mnist/test_data.bin


     If successful, the files can be seen in HDFS via hadoop dfs -ls /examples/mnist.

  2. Training:

     • Make sure conf/singa.conf has the correct path to the Zookeeper service:

         zookeeper_host: "node0:2181"

     • Make sure job.conf has the correct paths to the train and test datasets:

         // train layer
         path: "hdfs://node0:9000/examples/mnist/train_data.bin"
         // test layer
         path: "hdfs://node0:9000/examples/mnist/test_data.bin"

     • Start training: execute the following command at every node:

         ./singa -conf examples/mnist/job.conf -singa_conf singa.conf -singa_job 0

Added: websites/staging/singa/trunk/content/v0.2.0/hybrid.html
==============================================================================

Hybrid Parallelism

User Guide

SINGA supports different parallelism options for distributed training. Users just need to configure the option in the job configuration.

Both NetProto and LayerProto have a field, partition_dim, to control the parallelism option:

  • partition_dim=0: the neuralnet/layer is partitioned along the data dimension, i.e., each worker processes a subset of the data records.
  • partition_dim=1: the neuralnet/layer is partitioned along the feature dimension, i.e., each worker maintains a subset of the feature parameters.


The partition_dim field in NetProto applies to all layers, unless a layer sets its own partition_dim field.

If we want data parallelism for the whole model, we can just leave partition_dim at its default value (0), or configure job.conf like:

    neuralnet {
      partition_dim: 0
      layer {
        name: ...
        type: ...
      }
      ...
    }


With hybrid parallelism, we can have layers partitioned on either the data dimension or the feature dimension. For example, if we want a specific layer partitioned on the feature dimension, we just configure it like:

    neuralnet {
      partition_dim: 0
      layer {
        name: "layer1_partition_on_data_dimension"
        type: ...
      }
      layer {
        name: "layer2_partition_on_feature_dimension"
        type: ...
        partition_dim: 1
      }
      ...
    }


Developer Guide

+

To support hybrid parallelism, after singa read users’ model and paration configuration, a set of connection layers are automatically added between layers when needed:

+ +
    + +
  • +

    BridgeSrcLayer & BridgeDstLayer are added when two connected layers are not in the same machine. They are paired and are responsible for sending data/gradient to the other side during each iteration.

  • + +
  • +

    ConcateLayer is added when there are multiple source layers. It combines their feature blobs along a given dimension.

  • + +
  • +

    SliceLayer is added when there are mutliple dest layers, each of which only needs a subset(slice) of this layers’ feature blob.

  • + +
  • +

    SplitLayer is added when there are multiple dest layers, each of which needs the whole feature blob.

  • +
+
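To make the Slice/Concate semantics concrete, here is a small self-contained C++ sketch that slices a row-major feature blob along dimension 0 (data records) and concatenates the slices back. The Blob struct and function names are illustrative, not SINGA's blob code; slicing along dimension 1 would hand out column blocks instead.

    #include <cassert>
    #include <vector>

    // A row-major (rows x cols) feature blob, stored as a flat vector.
    struct Blob {
      int rows, cols;
      std::vector<float> data;
    };

    // Slice along dimension 0: each destination gets a contiguous block of rows.
    std::vector<Blob> Slice0(const Blob& src, int num_dst) {
      std::vector<Blob> out;
      int step = src.rows / num_dst;  // assume rows divide evenly, for brevity
      for (int d = 0; d < num_dst; ++d) {
        Blob b{step, src.cols, {}};
        b.data.assign(src.data.begin() + d * step * src.cols,
                      src.data.begin() + (d + 1) * step * src.cols);
        out.push_back(std::move(b));
      }
      return out;
    }

    // Concate along dimension 0: stack the slices' rows back together.
    Blob Concate0(const std::vector<Blob>& parts) {
      Blob out{0, parts[0].cols, {}};
      for (const Blob& p : parts) {
        out.rows += p.rows;
        out.data.insert(out.data.end(), p.data.begin(), p.data.end());
      }
      return out;
    }

    int main() {
      Blob x{4, 2, {0, 1, 2, 3, 4, 5, 6, 7}};
      Blob y = Concate0(Slice0(x, 2));  // slice for two workers, then merge
      assert(y.rows == 4 && y.data == x.data);
    }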

The following is the logic used in our code to add connection layers:

    Add Slice, Concate, Split Layers for Hybrid Partition

    All cases are as follows:
    src_pdim | dst_pdim | connection_type | Action
        0    |     0    |     OneToOne    | Direct Connection
        1    |     1    |     OneToOne    | Direct Connection
        0    |     0    |     OneToAll    | Direct Connection
        1    |     0    |     OneToOne    | Slice -> Concate
        0    |     1    |     OneToOne    | Slice -> Concate
        1    |     0    |     OneToAll    | Slice -> Concate
        0    |     1    |     OneToAll    | Split -> Concate
        1    |     1    |     OneToAll    | Split -> Concate

    Logic:
    dst_pdim = 1 && OneToAll ?
      (YES) Split -> Concate
      (NO)  src_pdim = dst_pdim ?
              (YES) Direct Connection
              (NO)  Slice -> Concate
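
This decision procedure is small enough to transcribe directly; the following C++ sketch (with illustrative enum and function names, not SINGA's actual types) implements the table and logic above.

    #include <cassert>

    // Illustrative types; SINGA's actual enums and classes differ.
    enum class Connection { kOneToOne, kOneToAll };
    enum class Action { kDirect, kSliceConcate, kSplitConcate };

    // Decide which connection layers to insert between two partitioned layers.
    Action ConnectionAction(int src_pdim, int dst_pdim, Connection conn) {
      if (dst_pdim == 1 && conn == Connection::kOneToAll)
        return Action::kSplitConcate;   // Split -> Concate
      if (src_pdim == dst_pdim)
        return Action::kDirect;         // Direct Connection
      return Action::kSliceConcate;     // Slice -> Concate
    }

    int main() {
      // Spot-check a few rows of the table above.
      assert(ConnectionAction(0, 0, Connection::kOneToOne) == Action::kDirect);
      assert(ConnectionAction(1, 0, Connection::kOneToAll) == Action::kSliceConcate);
      assert(ConnectionAction(1, 1, Connection::kOneToAll) == Action::kSplitConcate);
    }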

Added: websites/staging/singa/trunk/content/v0.2.0/index.html
==============================================================================

Latest Documentation

Added: websites/staging/singa/trunk/content/v0.2.0/installation.html
==============================================================================

Installation

Currently, there are two ways to install SINGA: build directly from source, or build a Docker image.
