mxnet-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] jmacglashan opened a new issue #5677: Constant Initializer only works on keys ending with weight
Date Thu, 01 Jan 1970 00:00:00 GMT
jmacglashan opened a new issue #5677: Constant Initializer only works on keys ending with weight
URL: https://github.com/apache/incubator-mxnet/issues/5677
 
 
   The Constant initializer only overrides the _init_weight method. Consequently, there is
no good way to use the existing initializers to initialize bias terms, which are always
set to zero.
   
   Example code
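   ```
   import mxnet as mx

   data = mx.sym.var('data')
   out = mx.sym.FullyConnected(data, num_hidden=3, name='out')
   ex = out.simple_bind(mx.cpu(), data=(1, 3))
   init = mx.initializer.Constant(1)
   # Apply the Constant initializer to the bias parameter by name
   init(mx.initializer.InitDesc('out_bias'), ex.arg_dict['out_bias'])
   # One would expect ones here, but the bias is left at zero
   print(ex.arg_dict['out_bias'].asnumpy())
   ```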
   
   The same is true for Uniform and Normal. An argument could be made that biases are
*typically* set to zero when Uniform and Normal are used, but the current design still results
in very unexpected behavior. If I apply a Constant, Uniform, or Normal initializer to a variable,
my expectation is that it will actually initialize the variable in that way! Only by
digging into the source code does it become apparent that this expectation is violated, and even
if it were made more apparent in the documentation, I don't think that's the right design anyway.
The functions should work as you would expect from their names. If someone wants to treat bias
terms differently, they should use a Mixed initializer to state that difference explicitly. Or,
if initializing biases to zero is common enough, allow that mode with a flag in the constructor of
these initializers, so it's still explicit but simpler than using Mixed.
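
   To make the flag proposal concrete, here is a minimal sketch in plain Python. It is a toy
model, not MXNet code: the class names ToyInitializer and ToyConstant and the zero_bias flag
are hypothetical, and plain lists stand in for arrays. It only assumes the name-suffix dispatch
described above.

```python
# Toy model only: these classes and the zero_bias flag are hypothetical
# illustrations of the proposal and are NOT part of MXNet's API.


class ToyInitializer:
    """Dispatches on the parameter name's suffix, as described in this issue."""

    def __call__(self, name, arr):
        if name.endswith('bias'):
            self._init_bias(name, arr)
        elif name.endswith('weight'):
            self._init_weight(name, arr)
        else:
            raise ValueError('unknown parameter name: %s' % name)

    def _init_bias(self, name, arr):
        # Base-class behavior that causes the surprise: biases become zero.
        arr[:] = [0.0] * len(arr)

    def _init_weight(self, name, arr):
        raise NotImplementedError


class ToyConstant(ToyInitializer):
    """Fills every parameter with `value` unless zero_bias is requested."""

    def __init__(self, value, zero_bias=False):
        self.value = value
        self.zero_bias = zero_bias  # hypothetical opt-in flag

    def _init_weight(self, name, arr):
        arr[:] = [self.value] * len(arr)

    def _init_bias(self, name, arr):
        if self.zero_bias:
            # Explicitly requested: keep the zero-bias convention.
            super()._init_bias(name, arr)
        else:
            # Default: do what the initializer's name says.
            self._init_weight(name, arr)


bias = [0.5, 0.5, 0.5]
ToyConstant(1.0)('out_bias', bias)
print(bias)

bias_zeroed = [0.5, 0.5, 0.5]
ToyConstant(1.0, zero_bias=True)('out_bias', bias_zeroed)
print(bias_zeroed)
```

   With this shape, the default does what the name says, and zeroing the bias is an explicit
choice made at construction time.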
   
   For more complex initialization strategies, it makes sense for the behavior to depend on the variable.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
