singa-dev mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [incubator-singa] chrishkchris opened a new pull request #535: SINGA-490 Optimize performance of stochastic gradient descent (SGD)
Date Wed, 18 Sep 2019 08:28:20 GMT
URL: https://github.com/apache/incubator-singa/pull/535
 
 
   I have fused the small operations of momentum SGD into a single operation, which improves GPU computation efficiency and reduces kernel-launch latency. In addition, I have added a Sync() call in resnet.py for more accurate time profiling: it waits for the preceding CUDA operations to finish before the elapsed time is measured.
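   As a rough illustration of what the fusion buys (this is a NumPy sketch of the arithmetic only, not the actual SINGA CUDA kernels; the function names here are hypothetical): the unfused path performs several small element-wise passes, each of which would launch its own GPU kernel, while the fused path applies the whole momentum update in one pass.
   
   ```python
   import numpy as np
   
   def momentum_sgd_unfused(w, grad, buf, lr=0.01, momentum=0.9):
       # Three separate element-wise passes; on a GPU each would be
       # its own small kernel launch, paying launch latency every time.
       buf *= momentum       # pass 1: decay the momentum buffer
       buf += lr * grad      # pass 2: accumulate the scaled gradient
       w -= buf              # pass 3: apply the update to the weights
       return w, buf
   
   def momentum_sgd_fused(w, grad, buf, lr=0.01, momentum=0.9):
       # Mathematically identical update expressed as one combined step;
       # in SINGA the equivalent is a single fused CUDA kernel.
       buf[:] = momentum * buf + lr * grad
       w -= buf
       return w, buf
   ```
   
   Both variants produce identical results; the benefit of fusion is purely in launch overhead, which dominates when the individual tensors are small.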
   
   1. This is the new result after improving the momentum SGD:
   
   ```
   ubuntu@ip-172-31-39-137:~/incubator-singa/examples/autograd$ python3 mnist_cnn.py
   Starting Epoch 0:
   Training loss = 583.052124, training accuracy = 0.793690
   Evaluation accuracy = 0.943409, Elapsed Time = 4.191409s
   Starting Epoch 1:
   Training loss = 229.894424, training accuracy = 0.923609
   Evaluation accuracy = 0.961438, Elapsed Time = 4.170332s
   Starting Epoch 2:
   Training loss = 168.670303, training accuracy = 0.943937
   Evaluation accuracy = 0.964744, Elapsed Time = 4.186504s
   Starting Epoch 3:
   Training loss = 133.865494, training accuracy = 0.955259
   Evaluation accuracy = 0.978566, Elapsed Time = 4.188593s
   Starting Epoch 4:
   Training loss = 116.104378, training accuracy = 0.961730
   Evaluation accuracy = 0.971554, Elapsed Time = 4.195830s
   Starting Epoch 5:
   Training loss = 101.295425, training accuracy = 0.966299
   Evaluation accuracy = 0.974059, Elapsed Time = 4.191312s
   Starting Epoch 6:
   Training loss = 94.570869, training accuracy = 0.969684
   Evaluation accuracy = 0.977464, Elapsed Time = 4.181115s
   Starting Epoch 7:
   Training loss = 85.930618, training accuracy = 0.970968
   Evaluation accuracy = 0.984675, Elapsed Time = 4.182598s
   Starting Epoch 8:
   Training loss = 83.169617, training accuracy = 0.971768
   Evaluation accuracy = 0.985076, Elapsed Time = 4.202356s
   Starting Epoch 9:
   Training loss = 77.906853, training accuracy = 0.973969
   Evaluation accuracy = 0.982372, Elapsed Time = 4.191382s
   ubuntu@ip-172-31-39-137:~/incubator-singa/examples/autograd$ python3 resnet.py
   Start intialization............
   100%|███████████████████████████████████████████████████████████████████████| 100/100 [01:26<00:00,  1.14it/s]
   Throughput = 36.89267491263885 per second
   Total=0.8673808574676514, forward=0.2684857630729675, softmax=0.0027115750312805176, backward=0.5961835193634033, sgd=0.03734057664871216
   ```
   
   2. This is the old result before improving the momentum SGD:
   ```
   ubuntu@ip-172-31-39-137:~/incubator-singa/examples/autograd$ python3 mnist_cnn.py
   Starting Epoch 0:
   Training loss = 581.382263, training accuracy = 0.794974
   Evaluation accuracy = 0.934495, Elapsed Time = 5.541576s
   Starting Epoch 1:
   Training loss = 233.281906, training accuracy = 0.920808
   Evaluation accuracy = 0.953025, Elapsed Time = 5.492121s
   Starting Epoch 2:
   Training loss = 169.505447, training accuracy = 0.943503
   Evaluation accuracy = 0.971454, Elapsed Time = 5.493372s
   Starting Epoch 3:
   Training loss = 136.643906, training accuracy = 0.954309
   Evaluation accuracy = 0.975761, Elapsed Time = 5.513660s
   Starting Epoch 4:
   Training loss = 116.743042, training accuracy = 0.960963
   Evaluation accuracy = 0.979968, Elapsed Time = 5.526858s
   Starting Epoch 5:
   Training loss = 103.864464, training accuracy = 0.965732
   Evaluation accuracy = 0.979667, Elapsed Time = 5.513694s
   Starting Epoch 6:
   Training loss = 94.542282, training accuracy = 0.968550
   Evaluation accuracy = 0.975461, Elapsed Time = 5.520474s
   Starting Epoch 7:
   Training loss = 87.548050, training accuracy = 0.971368
   Evaluation accuracy = 0.980970, Elapsed Time = 5.535038s
   Starting Epoch 8:
   Training loss = 83.162071, training accuracy = 0.971485
   Evaluation accuracy = 0.975661, Elapsed Time = 5.536836s
   Starting Epoch 9:
   Training loss = 78.447533, training accuracy = 0.974570
   Evaluation accuracy = 0.982772, Elapsed Time = 5.547574s
   ubuntu@ip-172-31-39-137:~/incubator-singa/examples/autograd$ python3 resnet.py
   Start intialization............
   100%|███████████████████████████████████████████████████████████████████████| 100/100 [01:49<00:00,  1.11s/it]
   Throughput = 29.05542749993395 per second
   Total=1.101343286037445, forward=0.270987823009491, softmax=0.0029543495178222657, backward=0.8274011135101318, sgd=0.3130151700973511
   ```
   
   Comparing the two sets of results (1) and (2): after fusing the small operations, the mnist_cnn.py epoch time drops from about 5.5 s to about 4.2 s, resnet.py throughput rises from 29.1 to 36.9 images per second, and the SGD step time falls from 0.313 s to 0.037 s, so the new momentum SGD is substantially faster.
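   
   The speedups can be computed directly from the resnet.py numbers reported above:
   
   ```python
   # Timings taken verbatim from the two resnet.py runs in this PR.
   old = {"throughput": 29.05542749993395, "sgd": 0.3130151700973511}
   new = {"throughput": 36.89267491263885, "sgd": 0.03734057664871216}
   
   throughput_gain = new["throughput"] / old["throughput"]  # overall speedup
   sgd_speedup = old["sgd"] / new["sgd"]                    # SGD-step speedup
   print(f"throughput: {throughput_gain:.2f}x, sgd step: {sgd_speedup:.1f}x")
   ```
   
   The SGD step itself is roughly 8x faster, which translates into about a 1.27x end-to-end throughput gain since the optimizer step is only part of each iteration.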
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
