singa-dev mailing list archives

From Prasanna Balaprakash <pbala...@mcs.anl.gov>
Subject Training SINGA with a cluster
Date Sun, 10 Jul 2016 20:04:33 GMT
Dear developers,

I am trying to run SINGA in a cluster environment with ~100 hybrid (CPU+GPU) nodes.


I started with a single-node experiment.

As per the instructions, my COBALT job script runs "cat $COBALT_NODEFILE > conf/hostfile",
where $COBALT_NODEFILE is the file provided by COBALT that lists the allocated nodes.
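
For reference, that hostfile step could be sketched as below. This is a minimal sketch, not the documented SINGA procedure: it assumes SINGA reads one hostname per line from conf/hostfile, and unlike the plain "cat" it also drops blank and duplicate lines. The fallback to "localhost" is hypothetical and exists only so the snippet runs outside a job.

```shell
#!/bin/sh
# COBALT_NODEFILE is set by the scheduler inside a job.
# Outside a job, fall back to a hypothetical one-node list (assumption).
nodefile="${COBALT_NODEFILE:-}"
if [ -z "$nodefile" ]; then
  nodefile="$(mktemp)"
  echo "localhost" > "$nodefile"
fi

mkdir -p conf
# Keep non-blank lines, drop duplicates, preserve the original order.
awk 'NF && !seen[$0]++' "$nodefile" > conf/hostfile

echo "hosts written to conf/hostfile: $(wc -l < conf/hostfile)"
```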

I am not sure how to set the ZooKeeper location.

Also, how can I verify whether the GPU is being used? The log shows:

E0710 18:39:23.837704 72213 cluster.cc:50] proc #0 -> localhost:0 (pid = 72213) 
E0710 18:39:23.898723 72241 server.cc:64] Server (group = 0, id = 0) start 
E0710 18:39:24.898967 72242 worker.cc:79] Worker (group = 0, id = 0) start on CPU

From this log it seems that only the CPU is being used.
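
The check described above can be sketched as a small log filter. It assumes that every worker start line ends with "start on <device>", as the one line in the log above does; the wording SINGA prints for a GPU worker is not shown in this message and is not assumed here. The sample lines are copied from the log above, and in practice "log" would point at the real job output.

```shell
#!/bin/sh
# Sample lines copied from the log in this message (stand-in for the real log file).
log="$(mktemp)"
cat > "$log" <<'EOF'
E0710 18:39:23.837704 72213 cluster.cc:50] proc #0 -> localhost:0 (pid = 72213)
E0710 18:39:23.898723 72241 server.cc:64] Server (group = 0, id = 0) start
E0710 18:39:24.898967 72242 worker.cc:79] Worker (group = 0, id = 0) start on CPU
EOF

# Keep only worker start lines, strip everything up to the device name,
# and count how many workers started on each device.
devices="$(grep 'Worker (group' "$log" | sed 's/.* start on //' | sort | uniq -c)"
echo "$devices"
```

Independently of the SINGA log, running nvidia-smi on the node while the job is active shows whether any process is holding the GPU.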

Thanks
Prasanna


