mxnet-dev mailing list archives

From Matthias Seeger <msee...@gmail.com>
Subject Re: soft relu gradient, is it correct?
Date Sun, 06 Jan 2019 15:32:53 GMT
Hi Pedro,

these are just helper functions; you need to check the operator that uses
them. In this case, the function is the derivative expressed as a function
of the *output*, which is cheaper to compute:

y = log(1 + exp(x))  =>  dy/dx = 1/(1 + exp(-x)) = 1 - exp(-y)
(since exp(-y) = 1/(1 + exp(x)))
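A quick numeric check of this identity (a standalone sketch, not MXNet code; the function names are my own): the gradient written as the logistic function of the input x agrees with 1 - exp(-y) evaluated at the output y, and both go to zero as x -> -Inf:

```python
import math

def softrelu(x):
    # forward pass: y = log(1 + exp(x))  (softplus)
    return math.log1p(math.exp(x))

def grad_wrt_input(x):
    # derivative as a function of the input: the logistic function
    return 1.0 / (1.0 + math.exp(-x))

def grad_wrt_output(y):
    # the same derivative, expressed in terms of the output y
    return 1.0 - math.exp(-y)

for x in [-30.0, -3.0, -0.5, 0.0, 0.5, 3.0]:
    y = softrelu(x)
    # both forms agree, and stay in (0, 1) even for very negative x
    assert abs(grad_wrt_input(x) - grad_wrt_output(y)) < 1e-9
```

Note that plugging the *input* x into 1 - exp(-x) would indeed diverge to -Inf for x -> -Inf; the formula is only correct with the output y as its argument, which is what the backward pass receives.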

The same is the case for all sorts of other ops: always check the code of
the operator itself, not just the helper.

In any case, there are quite a few unit tests that would catch this, unless
of course people added functions after I wrote them and did not update the
unit tests.

Bye, Matthias


On Wed, Nov 21, 2018 at 12:52 AM Pedro Larroy <pedro.larroy.lists@gmail.com>
wrote:

> I bumped into the definition of the softrelu gradient:
>
>
> https://github.com/apache/incubator-mxnet/blob/master/src/operator/mshadow_op.h#L170
>
> Which is defined as 1 - exp(-x)
>
> As we define the forward of the softrelu as the softplus function,
> shouldn't the gradient be the logistic function?
>
> My understanding is that the gradient of the softrelu should go down
> to zero as x -> -Inf, which is not the case with the above
> definition, which goes to -Inf as x -> -Inf.
>
> https://en.wikipedia.org/wiki/Rectifier_(neural_networks)
>
>
> Pedro.
>


-- 
Matthias Seeger
