singa-dev mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [singa] chrishkchris edited a comment on pull request #697: New Model Layer Operator API
Date Tue, 02 Jun 2020 08:37:54 GMT

chrishkchris edited a comment on pull request #697:
URL: https://github.com/apache/singa/pull/697#issuecomment-637375715


   I am using this PR to train Xceptionnet in order to use the save_state function, but I encountered something strange:
   
   (i) The training and evaluation were both okay in https://github.com/apache/singa/pull/651
   ```
   (singa) dcsysh@panda7:~/singa/examples/autograd$ python3 train.py xceptionnet ci
   Starting Epoch 0:
   Training loss = 11198.645508, training accuracy = 0.214420
   Evaluation accuracy = 0.309000, Elapsed Time = 606.547117s
   Starting Epoch 1:
   Training loss = 6354.611328, training accuracy = 0.381020
   Evaluation accuracy = 0.457300, Elapsed Time = 612.817129s
   ```
   
   (ii) This time I think the training is okay, but something is wrong in the evaluation:
   ```
   root@e8a757397ca3:~/dcsysh/singa/examples/cnn# mpiexec -np 8 python3 train_mpi.py xceptionnet cifar10 --bs 16 --lr 0.04 --epoch 30
   Starting Epoch 0:
   Training loss = 11614.897461, training accuracy = 0.131190
   Evaluation accuracy = 0.099860, Elapsed Time = 98.705291s
   Starting Epoch 1:
   Training loss = 6932.552246, training accuracy = 0.157552
   Evaluation accuracy = 0.099860, Elapsed Time = 98.400360s
   Starting Epoch 2:
   Training loss = 6565.343262, training accuracy = 0.195853
   Evaluation accuracy = 0.099960, Elapsed Time = 99.807898s
   Starting Epoch 3:
   Training loss = 6173.305176, training accuracy = 0.254467
   Evaluation accuracy = 0.099960, Elapsed Time = 99.759293s
   Starting Epoch 4:
   Training loss = 5841.223633, training accuracy = 0.306430
   Evaluation accuracy = 0.099960, Elapsed Time = 99.962356s
   ```
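
For reference, CIFAR-10 has 10 classes, so an evaluation accuracy that stays at about 0.0999 is what random guessing would give, even though the training accuracy keeps rising. Below is a minimal sketch of how this pattern could be flagged automatically from the log. It assumes the log format shown above; `parse_accuracies` and `eval_stuck_at_chance` are hypothetical helper names for illustration and are not part of SINGA.

```python
import re

# Log excerpt copied from run (ii) above (first three epochs).
LOG_II = """\
Starting Epoch 0:
Training loss = 11614.897461, training accuracy = 0.131190
Evaluation accuracy = 0.099860, Elapsed Time = 98.705291s
Starting Epoch 1:
Training loss = 6932.552246, training accuracy = 0.157552
Evaluation accuracy = 0.099860, Elapsed Time = 98.400360s
Starting Epoch 2:
Training loss = 6565.343262, training accuracy = 0.195853
Evaluation accuracy = 0.099960, Elapsed Time = 99.807898s
"""


def parse_accuracies(log):
    """Return (training accuracies, evaluation accuracies) found in the log text."""
    train = [float(x) for x in re.findall(r"training accuracy = ([0-9.]+)", log)]
    evals = [float(x) for x in re.findall(r"Evaluation accuracy = ([0-9.]+)", log)]
    return train, evals


def eval_stuck_at_chance(log, num_classes=10, tol=0.01):
    """True if training accuracy improves while every evaluation accuracy
    stays within `tol` of random guessing (1 / num_classes)."""
    train, evals = parse_accuracies(log)
    chance = 1.0 / num_classes
    train_improves = len(train) > 1 and train[-1] > train[0]
    eval_flat = bool(evals) and all(abs(a - chance) <= tol for a in evals)
    return train_improves and eval_flat


if __name__ == "__main__":
    print(eval_stuck_at_chance(LOG_II))
```

The tolerance of 0.01 is an arbitrary choice for this sketch; the same check applied to the log from (i) would not trigger, because the evaluation accuracy there moves well away from 0.1.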
   
   OK, I just remembered that the bug fix for recycling has not yet been merged into the dev branch; it was merged into the master branch. So this bug has most probably already been fixed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


