mxnet-commits mailing list archives

From GitBox <>
Subject [GitHub] [incubator-mxnet] apeforest commented on issue #15120: [bug] fix higher grad log
Date Thu, 06 Jun 2019 18:17:07 GMT
apeforest commented on issue #15120: [bug] fix higher grad log 
   @kshitij12345 I think it's because of the design of the backward computation graph in MXNet.
In the C++ implementation, when you specify `variables=x`, it will compute gradients only for the listed input variables.
   As in your case 2:
   x_grad = autograd.grad(heads=y, variables=x, head_grads=y_grad, create_graph=True, retain_graph=True)[0]
   If you perform another backward on x_grad as `x_grad.backward(out_grad=head_grads_grads)`,
y_grad is not listed as an input variable and therefore its gradient is zero.
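   To see why nothing flows into y_grad here, note that for y = log(x) the first backward gives x_grad = y_grad / x, so y_grad only enters the second pass as a coefficient. A minimal plain-Python sketch of the underlying calculus (scalar case; the function names are illustrative, not MXNet API):

```python
# For y = log(x), the first backward with head gradient y_grad gives
#   x_grad = (dy/dx) * y_grad = y_grad / x
def first_backward(x, y_grad):
    return y_grad / x

# Differentiating x_grad in the second backward with head gradient h:
#   d x_grad / d x      = -y_grad / x**2   (returned for variables=x)
#   d x_grad / d y_grad =  1 / x           (only returned if y_grad is a variable)
def second_backward(x, y_grad, h):
    grad_wrt_x = h * (-y_grad / x ** 2)
    grad_wrt_y_grad = h * (1.0 / x)
    return grad_wrt_x, grad_wrt_y_grad

gx, gy = second_backward(x=2.0, y_grad=3.0, h=1.0)

# Finite-difference check of d x_grad / d x at x = 2
eps = 1e-6
fd = (first_backward(2.0 + eps, 3.0) - first_backward(2.0 - eps, 3.0)) / (2 * eps)
assert abs(fd - gx) < 1e-5
```

   The two partials mirror expected_grad_grad and expected_heads_grad from your script; the second one is only materialized when y_grad is listed in `variables`.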
   As in your case 1:
   x_grad = x_grad_mid * y_grad # Note
   You implicitly made y_grad an input variable when calling backward on x_grad. That
is why you get values in y_grad.grad.
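   The explicit multiplication is the key difference: once x_grad = x_grad_mid * y_grad is built as a new graph node, the product rule routes a gradient into y_grad during backward. A hand-rolled reverse-mode sketch of just that node (illustrative scalar code, not the MXNet implementation):

```python
# Tiny reverse-mode node for a product a * b, enough to mirror case 1
class Leaf:
    def __init__(self, value):
        self.value, self.grad = value, 0.0

class Mul:
    def __init__(self, a, b):
        self.a, self.b = a, b
        self.value = a.value * b.value
    def backward(self, out_grad):
        # product rule: both factors receive a gradient
        self.a.grad += out_grad * self.b.value
        self.b.grad += out_grad * self.a.value

x_grad_mid = Leaf(0.5)   # stands in for 1/x with x = 2
y_grad = Leaf(3.0)       # the head gradient, now an explicit graph input
x_grad = Mul(x_grad_mid, y_grad)
x_grad.backward(out_grad=1.0)
# y_grad.grad is now x_grad_mid.value, i.e. 0.5
```

   With x_grad_mid standing in for 1/x, y_grad receives exactly x_grad_mid's value, which is why y_grad.grad is populated in case 1 but stays zero in case 2.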
   I replaced the `backward()` method with an explicit `autograd.grad()` call, which should
invoke the same C++ backend function, but the result differs.
   case 1.1: if I do the following, I again don't get any values for y_grad, because the output
only contains one gradient variable:
   out_grad = autograd.grad(heads=x_grad, variables=x, head_grads=head_grad_grads, create_graph=False, retain_graph=False)
   print(out_grad[0])   # value equals expected_grad_grad
   case 1.2: if I explicitly set y_grad as an input variable, I then get the expected result as
in your case 1:
   out_grad = autograd.grad(heads=x_grad, variables=[x, y_grad], head_grads=head_grad_grads, create_graph=False, retain_graph=False)
   print(out_grad[0])   # value equals expected_grad_grad
   print(out_grad[1])   # value equals expected_heads_grad
   At this point, I am not sure this is a bug, because the backward API is designed differently
from PyTorch's. If y_grad is not specified as one of the input variables to compute gradients
for, it will not get a value assigned even if you call `y_grad.attach_grad()` on it.
This seems consistent with the API spec. Also, given that y_grad does not carry
genuinely useful values here, I don't see the necessity of storing it. Please let me know if this
makes sense. Thanks a lot for your careful drawing and insightful discussion.
