singa-dev mailing list archives

From: Wang Wei <wang...@comp.nus.edu.sg>
Subject: Re: Question about model parallelism and multi-GPUs
Date: Wed, 11 Nov 2015 01:37:10 GMT
Hi Edward,

Thanks for starting the discussion.

On Wed, Nov 11, 2015 at 8:44 AM, Edward J. Yoon <edwardyoon@apache.org>
wrote:

> NYCTMI, I am having some doubts. The RNN model parallelism described in
> the TensorFlow whitepaper somewhat makes sense to me, but did Google
> actually use model parallelism for CNNs on a multi-node cluster (with
> multiple devices per node)? Blocked mat-mult on GPUs appears to me slow
> and memory demanding. I mean, it is possible, but performance will
> suffer.
>
No, I didn't find details on how they implement model parallelism for
fully connected layers. I think in TensorFlow, model parallelism is done
by the users themselves: they specify the sessions and the sub-graph that
each session executes.
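
For concreteness, here is a minimal sketch of that kind of user-driven
model parallelism, written against the later TF 1.x-style API (the API at
the time of this thread differed in detail); the layer sizes and device
names are illustrative assumptions:

    # Hedged sketch: the user splits a fully connected layer's weight
    # matrix column-wise across two GPUs; each device computes a partial
    # matmul and the partial outputs are concatenated.
    import tensorflow as tf

    x = tf.placeholder(tf.float32, shape=[None, 1024])    # input batch

    with tf.device('/gpu:0'):
        w0 = tf.Variable(tf.random_normal([1024, 2048]))  # left half of W
        y0 = tf.matmul(x, w0)                             # partial product

    with tf.device('/gpu:1'):
        w1 = tf.Variable(tf.random_normal([1024, 2048]))  # right half of W
        y1 = tf.matmul(x, w1)                             # partial product

    # Full layer output, shape [None, 4096]; the framework inserts the
    # cross-device transfers needed for the concat.
    y = tf.concat([y0, y1], axis=1)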

>
> So, I wanted to check how you approach this problem.
>
We are doing it by partitioning a layer into sub-layers; the online slides
have more details. This part is not fully complete yet; we will release it
in the second version, and you can check the code then.
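
To illustrate the idea with a minimal NumPy sketch (the shapes and names
here are illustrative, not SINGA's actual API): splitting a fully
connected layer's weight matrix column-wise yields sub-layers whose
concatenated outputs equal the unpartitioned layer's output.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.standard_normal((32, 256))    # batch of 32 inputs
    W = rng.standard_normal((256, 512))   # full weight matrix

    y_full = x @ W                        # unpartitioned layer

    # Two sub-layers, each owning half of the output neurons; in a real
    # deployment each sub-layer would run on a different worker/GPU.
    W_a, W_b = np.hsplit(W, 2)
    y_a = x @ W_a                         # computed by sub-layer A
    y_b = x @ W_b                         # computed by sub-layer B

    assert np.allclose(y_full, np.concatenate([y_a, y_b], axis=1))

Note that each sub-layer still needs the full input x, which is the
replication/communication cost behind the memory concern raised above.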

>
>
> On Wed, Nov 11, 2015 at 9:33 AM, ooibc <ooibc@comp.nus.edu.sg> wrote:
> >
> >
> > fyi, in case you are not in the dev@ list.
> >
> > There is a typo below: ancestor-->descendant (MXNet was introduced on
> > 9/28).
> >
> >
> > -------- Original Message --------
> > Subject: development plan for SINGA
> > Date: 2015-11-10 22:28
> > From: ooibc <ooibc@comp.nus.edu.sg>
> > To: dev@singa.incubator.apache.org
> > Reply-To: dev@singa.incubator.apache.org
> >
> > Based on our quick check of the TensorFlow release and online
> > discussions, it appears to be 2x slower than MXNet (ancestor of CXXNET)
> > on the CIFAR-10 dataset, and it contains older code such as cudnn-v2.
> > This could be just a form of crowdsourcing at work.
> >
> > SINGA is data-flow centric in design, and provides simple interfaces,
> > from layer abstraction to neural net structure, model
> > configuration/mapping, model/data partitioning, function overriding,
> > and training framework configuration.
> >
> > So, we are good and we should keep to the development/release plan
> > outlined in
> >     http://singa.apache.org/develop/schedule.html
> >
> > Thanks, and Happy Deepavali (to those who celebrate)!
> >
> > regards
> > beng chin
>
>
>
> --
> Best Regards, Edward J. Yoon
>
