singa-dev mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SINGA-81) Add Python Helper, which enables users to construct a model (JobProto) and run Singa in Python
Date Fri, 01 Jan 2016 12:29:39 GMT

    [ https://issues.apache.org/jira/browse/SINGA-81?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15076286#comment-15076286 ]

ASF subversion and git services commented on SINGA-81:
------------------------------------------------------

Commit 7d43e27330581c3eecbd44a04f0c8691c3502ec6 in incubator-singa's branch refs/heads/master
from chonho
[ https://git-wip-us.apache.org/repos/asf?p=incubator-singa.git;h=7d43e27 ]

SINGA-81 Add Python Helper, which enables users to construct a model (JobProto) and run Singa in Python

- Add wrapper API "Driver::Train(const std::string Job_conf)" for python.
- Add wrapper API "Driver::Test(const std::string Job_conf)" for python.

- The Python code (1) constructs a model (JobProto), and (2) runs Singa

- Users are expected to write their own 'usermodel.py'
  . examples are provided, e.g., cifar10_cnn.py, mnist_mlp.py, mnist_rbm.py
  . 'cluster.conf' is required to hold the cluster information

- Users are expected to run it as follows, e.g.,
  {code}
  cd SINGA_ROOT
  bin/singa-run.sh -conf tool/python/examples/cluster.conf -exe tool/python/examples/mnist_mlp.py
  {code}

- Note: in job.proto, the 'required' rule of the following fields should be changed to 'optional'
  . JobProto: name, neuralnet, train_one_batch, updater, train_steps
  . ClusterProto: workspace
     . workspace field can be set in either (i) cluster.conf or (ii) python code

- __init__.py is required in the following directories
  . singa
  . singa/utils
  . singa/datasets
  . examples
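
The directory list above can be checked or prepared with a short script. This is only a sketch: the helper name `ensure_init_files` is hypothetical, and the root directory is an assumption (the run command above suggests `tool/python`, but the demo below uses a temporary directory instead).

```python
import tempfile
from pathlib import Path

# Directories that the commit message says need an __init__.py.
PACKAGE_DIRS = ["singa", "singa/utils", "singa/datasets", "examples"]


def ensure_init_files(root):
    """Create an empty __init__.py in each package directory under root.

    Hypothetical helper: missing directories are created too, which is
    convenient for the demo but may not be wanted in a real checkout.
    Returns the list of files that were newly created.
    """
    created = []
    for rel in PACKAGE_DIRS:
        pkg = Path(root) / rel
        pkg.mkdir(parents=True, exist_ok=True)
        init_file = pkg / "__init__.py"
        if not init_file.exists():
            init_file.touch()
            created.append(init_file)
    return created


# Demo in a throwaway directory so no real source tree is modified.
with tempfile.TemporaryDirectory() as tmp:
    created = ensure_init_files(tmp)
    created_names = sorted(p.relative_to(tmp).as_posix() for p in created)
```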

- Add StoreResult() that takes care of training results
  . in SingaRun() called by fit() or evaluate()
  . read logfile
  . store accuracy, loss, ppl, se, etc. in dictionary format
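
The log-parsing step described above could look roughly like the following sketch. The log line layout (a `step-N` marker followed by `metric : value` pairs), the function name `store_result`, and the regular expressions are all assumptions made for illustration; the actual Singa log format and the helper's StoreResult() may differ.

```python
import re
from collections import defaultdict

# Assumed layout: a "step-<N>" marker plus "<metric> : <number>" pairs.
# This is illustrative only, not Singa's documented log format.
STEP_RE = re.compile(r"step-(\d+)")
METRIC_RE = re.compile(
    r"\b(accuracy|loss|ppl|se)\s*:\s*([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)"
)


def store_result(lines):
    """Collect (step, value) pairs per metric from an iterable of log lines.

    Returns a plain dict such as {"loss": [(0, 2.3), ...], ...}.
    """
    result = defaultdict(list)
    for line in lines:
        step_match = STEP_RE.search(line)
        if step_match is None:
            continue  # ignore lines that carry no step marker
        step = int(step_match.group(1))
        for name, value in METRIC_RE.findall(line):
            result[name].append((step, float(value)))
    return dict(result)


# Made-up sample lines in the assumed layout.
sample_log = [
    "Train step-0, loss : 2.302585, accuracy : 0.10",
    "unrelated line",
    "Train step-100, loss : 1.250000, accuracy : 0.55",
]
parsed = store_result(sample_log)
```

In practice the lines would come from the job's logfile rather than an in-memory list; passing an open file object to `store_result` works the same way because it only iterates over lines.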

- Parameter initialization
  . Parameter class is internally used
     . Weights are initialized from a Gaussian distribution by default
     . Biases are initialized to a constant by default
  . Optionally, users can specify parameters explicitly (e.g., *_parameter.py)
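
The default-versus-explicit behaviour described above might be modelled as in the sketch below. The class body, the `kind` argument, and the numeric defaults (mean, std, constant value) are assumptions for illustration; only the "Gaussian for weights, constant for biases" rule comes from the commit message.

```python
class Parameter:
    """Hypothetical stand-in for the helper's internal Parameter class."""

    # Assumed default settings; the real helper may use other values.
    DEFAULTS = {
        "weight": {"init": "gaussian", "mean": 0.0, "std": 0.01},
        "bias": {"init": "constant", "value": 0.0},
    }

    def __init__(self, kind, init=None, **options):
        if kind not in self.DEFAULTS:
            raise ValueError("kind must be 'weight' or 'bias'")
        self.kind = kind
        if init is None:
            # No explicit initializer: start from the defaults for this
            # kind and let the caller tweak individual options.
            self.config = dict(self.DEFAULTS[kind])
            self.config.update(options)
        else:
            # Explicit initializer: use only what the caller specified.
            self.config = {"init": init, **options}


default_weight = Parameter("weight")
default_bias = Parameter("bias")
custom_weight = Parameter("weight", init="uniform", low=-0.05, high=0.05)
```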

- Removed dataset/ae.py and dataset/rbm.py
  . RBM and Autoencoder examples use Mnist dataset


> Add Python Helper, which enables users to construct a model (JobProto) and run Singa in Python
> ----------------------------------------------------------------------------------------------
>
>                 Key: SINGA-81
>                 URL: https://issues.apache.org/jira/browse/SINGA-81
>             Project: Singa
>          Issue Type: New Feature
>            Reporter: Lee Chonho
>
> Proposed design v1
> - (1) have a class named Builder
>    * (2) use Boost::parameter library (+ associated necessary header files)
> (1)
> Builder class implements api-like functions to configure JobProto including NetProto (LayerProto), ClusterProto, UpdaterProto, etc.
> Two options
> a. users call Builder's functions in main.cc like
> {code}
> JobProto jobproto;
> Builder builder( &jobproto, "job name" );
> builder.xxx(xxx);  // add data layer
> builder.xxx(xxx);  // add parser layer
> ... etc. ...
> {code}
> b. we set main.cc like below and users call Builder's functions in Construct()
> {code}
> JobProto jobproto;
> Builder builder( &jobproto, "model name" );
> builder.Construct();
> {code}
> (2)
> Planning to use header-only files from Boost library
> - if the necessary files are small enough
> - because we can use the "named arguments" feature with no restriction on the number, order, or types of arguments.
> - because the functions will be intuitive, and users can add their own proto in a straightforward way.
> Example is here. http://theboostcpplibraries.com/boost.parameter
> Following the example, we can write something like
> {code}
> BOOST_PARAMETER_MEMBER_FUNCTION(   
>  (char*), AddLayerData,   tag,   
>  (required     
>    (type, (int))  (name, (char*))  (src, (char*))
>  ) 
>  (optional     
>    (path, (char*), *) (bsize, (int), *)
>  ) )
>  {
>     ...  // set values
>     return name;
>  }
> {code}
> Then, users can add a datalayer by
> {code}
> L1 = builder.AddLayerData(kShardData, "data", nullptr, _path="train_shard", _bsize=1000);
> {code}
> (TODO)
> - make use of google protobuf reflection for efficient parameter setting
> - need to avoid multiple calls for adding same/similar layers
> Any suggestions, design ideas, or comments are welcome.
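
For comparison, the named-argument call in the quoted proposal maps naturally onto Python keyword arguments. The sketch below is a hypothetical Python analogue, not the helper's actual API: the `Builder` class, the `add_layer_data` method, and the plain dict standing in for JobProto are all assumptions.

```python
class Builder:
    """Hypothetical Python analogue of the proposed C++ Builder."""

    def __init__(self, job_name):
        # A plain dict stands in for JobProto in this sketch.
        self.job = {"name": job_name, "layers": []}

    def add_layer_data(self, layer_type, name, src, *, path=None, bsize=None):
        """Add a data layer; path and bsize are optional keyword arguments."""
        layer = {"type": layer_type, "name": name, "src": src}
        if path is not None:
            layer["path"] = path
        if bsize is not None:
            layer["bsize"] = bsize
        self.job["layers"].append(layer)
        return name  # mirrors the proposal, which returns the layer name


builder = Builder("job name")
# Keyword arguments can be given in any order, as with Boost.Parameter.
l1 = builder.add_layer_data(
    "kShardData", "data", None, bsize=1000, path="train_shard"
)
```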



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
