singa-dev mailing list archives

From "wangwei (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SINGA-36) Refactor job configuration, driver program and scripts
Date Thu, 23 Jul 2015 07:05:05 GMT

     [ https://issues.apache.org/jira/browse/SINGA-36?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wangwei updated SINGA-36:
-------------------------
    Description: 
Currently, we use Google Protocol Buffers to generate the ClusterProto and ModelProto classes
for cluster (i.e., worker, server, group, etc.) configuration and model (i.e., neuralnet, updater,
etc.) configuration respectively. These classes provide functions to load/parse plain text
configuration files.

To make the naming more representative and simplify the configuration process, this ticket
will:

* merge cluster configuration and model configuration into a single job configuration (JobProto).

* move the zookeeper configuration into conf/singa.conf, which is global to all jobs. conf/hostfile
lists all available nodes.
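
As a rough illustration of the split described above (per-job settings in one place, global settings in another), here is a minimal C++ sketch. Every type and field name in it (ClusterConf, ModelConf, JobConf, GlobalConf, nworkers, zookeeper_host, and so on) is hypothetical and chosen only for illustration; the real JobProto is generated by protocol buffers, and this message does not say which fields it has or whether it nests or flattens the old messages.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the per-job cluster settings (worker, server, group).
struct ClusterConf {
  int nworkers = 1;
  int nservers = 1;
};

// Hypothetical stand-in for the model settings (neuralnet, updater).
struct ModelConf {
  std::vector<std::string> layer_types;
};

// After the merge, a single job-level object carries both parts.
struct JobConf {
  std::string name;
  ClusterConf cluster;
  ModelConf model;
};

// Settings shared by all jobs, as would be read from conf/singa.conf.
struct GlobalConf {
  std::string zookeeper_host;  // hypothetical field name
};

// Summarizes a job using only job-level settings; nothing global is needed.
std::string Describe(const JobConf& job) {
  return job.name + ": " + std::to_string(job.cluster.nworkers) +
         " worker(s), " + std::to_string(job.cluster.nservers) +
         " server(s), " + std::to_string(job.model.layer_types.size()) +
         " layer(s)";
}
```

The point of the sketch is only that a job can be described from one object, while zookeeper settings live separately and do not have to be repeated per job.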


The driver program is updated so that users can register customized Layer implementations
in it.
Once the job configuration is ready, the user submits the job via the singa::SubmitJob() function.
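
To make the idea of registering customized Layer implementations concrete, the following is a minimal sketch of a name-to-factory registry in C++. It is not SINGA's API: Layer, LayerRegistry, Register, Create and MyLayer are all hypothetical names invented for this example, and the real registration and submission calls in singa.h may look quite different.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical base class standing in for SINGA's Layer; the real
// interface is not shown in this message.
class Layer {
 public:
  virtual ~Layer() = default;
  virtual std::string Name() const = 0;
};

// Hypothetical registry that maps a layer type string (as it might appear
// in the job configuration) to a factory that creates that layer.
class LayerRegistry {
 public:
  using Factory = std::function<std::unique_ptr<Layer>()>;

  // Registers a factory; returns false if the type name is already taken.
  bool Register(const std::string& type, Factory factory) {
    assert(factory != nullptr);
    return factories_.emplace(type, std::move(factory)).second;
  }

  // Creates a layer of the given type, or returns nullptr if unknown.
  std::unique_ptr<Layer> Create(const std::string& type) const {
    auto it = factories_.find(type);
    if (it == factories_.end()) {
      return nullptr;
    }
    return it->second();
  }

 private:
  std::map<std::string, Factory> factories_;
};

// A user-defined layer, of the kind a driver program might supply.
class MyLayer : public Layer {
 public:
  std::string Name() const override { return "MyLayer"; }
};
```

In a driver program built along these lines, the user would register each custom layer under a type name before submitting the job, so that the framework can later construct layers by the names found in the job configuration.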

Header files are merged into singa.h to simplify the driver program.

The arguments of singa-run.sh are updated so that users pass the workspace (and resume option)
to it. singa-run.sh uses the default driver executable (i.e., SINGA_ROOT/singa).
TODO: enable users to pass their own driver executable to the script.

Some layers are moved into optional_layer.h (.cc) because they depend on external libraries
(e.g., LMDB and OpenCV). TODO: update the GNU make files, e.g., using with-feature=huge for
a full compilation that checks all dependencies; otherwise only mandatory libraries are checked.
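
One common way to keep layers with external dependencies optional is conditional compilation. The sketch below shows the general pattern only; the macros USE_LMDB and USE_OPENCV, the function AvailableLayerTypes, and the layer type strings are all assumptions made for this example, not names taken from optional_layer.h or the SINGA make files.

```cpp
#include <string>
#include <vector>

// Returns the layer types compiled into this build. The feature macros are
// hypothetical; a build system would define them (e.g., -DUSE_LMDB) only
// when the corresponding library was found.
std::vector<std::string> AvailableLayerTypes() {
  // Layers with no external dependency are always available.
  std::vector<std::string> types = {"InnerProduct", "ReLU"};
#ifdef USE_LMDB
  types.push_back("LMDBData");  // needs LMDB
#endif
#ifdef USE_OPENCV
  types.push_back("ImageResize");  // needs OpenCV
#endif
  return types;
}
```

With this pattern, a default build checks and links only the mandatory libraries, while a full build defines the extra macros and pulls in the optional layers.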

Scripts for job management have minor changes, such as cleaning up the log info.


  was:
Currently, we use Google Protocol Buffers to generate the ClusterProto and ModelProto classes
for cluster (i.e., worker, server, group, etc.) configuration and model (i.e., neuralnet, updater,
etc.) configuration respectively. These classes provide functions to load/parse plain text
configuration files.

To make the naming more representative and simplify the configuration process, this ticket
will:

* move worker, server and group configuration into JobProto, i.e., these configurations are
job/application specific.

* make ModelProto a field of JobProto, because the neuralnet configuration is also job
specific.

* move zookeeper, hostfile, etc configuration into ClusterProto, because these fields are
shared by all jobs.

The configuration for ClusterProto is done only once, when installing SINGA.
The configuration file is fixed at $SINGA/conf/singa.conf.

The configuration for JobProto is done every time a new job is submitted. The conf file is
$workspace/job.conf. Users submit the job by passing -conf=$workspace to ./bin/singa-run.sh.


> Refactor job configuration, driver program and scripts
> ------------------------------------------------------
>
>                 Key: SINGA-36
>                 URL: https://issues.apache.org/jira/browse/SINGA-36
>             Project: Singa
>          Issue Type: Improvement
>            Reporter: wangwei
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
