mxnet-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] szhengac commented on a change in pull request #10350: Fix Gluon Language Model Example
Date Sun, 01 Apr 2018 06:48:43 GMT
szhengac commented on a change in pull request #10350: Fix Gluon Language Model Example
URL: https://github.com/apache/incubator-mxnet/pull/10350#discussion_r178449611
 
 

 ##########
 File path: example/gluon/word_language_model/train.py
 ##########
 @@ -159,19 +160,19 @@ def train():
             hidden = detach(hidden)
             with autograd.record():
                 output, hidden = model(data, hidden)
+                # Here L is a vector of size batch_size * bptt size
                 L = loss(output, target)
+                L = L / (args.bptt * args.batch_size)
                 L.backward()
 
             grads = [p.grad(context) for p in model.collect_params().values()]
-            # Here gradient is for the whole batch.
-            # So we multiply max_norm by batch_size and bptt size to balance it.
-            gluon.utils.clip_global_norm(grads, args.clip * args.bptt * args.batch_size)
+            gluon.utils.clip_global_norm(grads, args.clip)
 
-            trainer.step(args.batch_size)
+            trainer.step(1)
 
 Review comment:
   Yes, the loss has been rescaled manually. Also, we should rescale the loss by batch_size * bptt instead.
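
To illustrate why the clipping threshold in the diff can drop the `args.bptt * args.batch_size` factor once the loss itself is divided by it, here is a minimal sketch in plain Python. It is not MXNet's implementation: `clip_by_global_norm_sketch` and the gradient values are hypothetical stand-ins, and the sketch only covers the clipping threshold, not the `trainer.step` argument.

```python
import math

def clip_by_global_norm_sketch(grads, max_norm):
    """Return gradients rescaled so their joint L2 norm is at most max_norm.

    Hypothetical helper for illustration only; it mimics the idea of
    global-norm clipping, not any specific library implementation.
    """
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if total_norm == 0.0 or total_norm <= max_norm:
        return [list(vec) for vec in grads]
    scale = max_norm / total_norm
    return [[g * scale for g in vec] for vec in grads]

# Assumed values, chosen only for the demonstration.
bptt, batch_size, clip = 5, 4, 0.2
n = bptt * batch_size

# Hypothetical gradients of the summed (unnormalized) loss.
raw_grads = [[3.0, -4.0], [12.0, 0.0, 5.0]]

# Variant A: clip the raw gradients with the inflated threshold clip * n,
# then divide by n afterwards.
variant_a = [[g / n for g in vec]
             for vec in clip_by_global_norm_sketch(raw_grads, clip * n)]

# Variant B: divide the loss by n first (so every gradient is already
# divided by n), then clip with the plain threshold.
scaled_grads = [[g / n for g in vec] for vec in raw_grads]
variant_b = clip_by_global_norm_sketch(scaled_grads, clip)

flat_a = [g for vec in variant_a for g in vec]
flat_b = [g for vec in variant_b for g in vec]
same = all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
           for a, b in zip(flat_a, flat_b))
print("variants agree:", same)
```

Because the global norm scales linearly with the gradients, dividing the loss by `n` and clipping at `clip` gives the same clipped direction and magnitude as clipping the unscaled gradients at `clip * n` and dividing by `n` afterwards.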

