mxnet-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [incubator-mxnet] liuzh91 opened a new issue #17184: Weight tied cannot be used with weight sharing
Date Fri, 27 Dec 2019 07:38:34 GMT
URL: https://github.com/apache/incubator-mxnet/issues/17184
 
 
   ## Description
   We encounter a weight initialization error when using weight sharing and weight tying
simultaneously. We share weights between `model` and `model_eval`. The code is shown below:
   
   ```python
   model = nlp.model.train.AWDRNN(args.model, len(vocab), args.emsize, args.nhid, args.nlayers,
                                  args.tied, args.dropout, args.weight_dropout,
                                  args.dropout_h, args.dropout_i, args.dropout_e)
   model_eval = nlp.model.AWDRNN(args.model, len(vocab), args.emsize, args.nhid, args.nlayers,
                                 args.tied, args.dropout, args.weight_dropout,
                                 args.dropout_h, args.dropout_i, args.dropout_e,
                                 params=model.collect_params())
   
   model.initialize(mx.init.Xavier(), ctx=context)
   
   model.hybridize(static_alloc=True)
   
   print(model)
   
   def check_initialized(net):
       params = net.collect_params()
       for param in params:
           try:
               params[param].list_ctx()
           except RuntimeError:
               return False
       return True
   
   print(check_initialized(model))
   print(check_initialized(model_eval))
   ```
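   The `check_initialized` helper above can be exercised without MXNet. Below is a pure-Python sketch of the same logic, using a hypothetical `StubParameter` class standing in for `mxnet.gluon.Parameter` (the real class raises the same `RuntimeError` from `list_ctx()` when uninitialized, as shown in the debug output further down):

   ```python
   # StubParameter is a hypothetical stand-in for mxnet.gluon.Parameter,
   # used only to illustrate the check_initialized logic.
   class StubParameter:
       def __init__(self, name, initialized):
           self.name = name
           self._ctx = ["cpu(0)"] if initialized else None

       def list_ctx(self):
           # Mirrors Gluon's behavior: raise if never initialized.
           if self._ctx is None:
               raise RuntimeError(
                   "Parameter '%s' has not been initialized" % self.name)
           return self._ctx

   def check_initialized(params):
       # Same logic as the helper in the issue: a single uninitialized
       # parameter makes the whole model report False.
       for param in params.values():
           try:
               param.list_ctx()
           except RuntimeError:
               return False
       return True

   params_ok = {"w": StubParameter("w", True)}
   params_bad = {"w": StubParameter("w", True),
                 "embedding0_bias": StubParameter("embedding0_bias", False)}
   print(check_initialized(params_ok))   # True
   print(check_initialized(params_bad))  # False
   ```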
   
   ### Log Message
   If `args.tied` is set to `True`, we get the following log message:
   ```python
   True
   False
   ```
   If we turn off `args.tied`, the initialization works correctly.
   
   ### To Reproduce
   The script can be found at https://github.com/dmlc/gluon-nlp/blob/v0.8.x/scripts/language_model/word_language_model.py.
To reproduce the message above, replace line 153 onward with the code snippet above, then run
the following command:
   
   ```
   python -m pdb word_language_model.py --tied --dropout_e=0
   ```
   You will encounter the above error.
   
   ## What have you tried to solve it?
   
   The parameter that is not initialized properly is `awdrnn0_hybridsequential0_embedding0_bias`,
which is used in the decoder of AWDRNN. After some investigation, we found that the tied weights
introduce this error:
   
   ```python
       if self._tie_weights:
            output.add(nn.Dense(self._vocab_size, flatten=False,
                                params=self.embedding[0].params))
   ```
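   The symptom is consistent with the general shape of weight tying: the decoder reuses the embedding's weight, but its bias remains a separate parameter. A toy illustration (plain Python objects, not the Gluon API — the class and attribute names here are our own, for illustration only):

   ```python
   # Hypothetical minimal model of weight tying. "Layer" and its attributes
   # are illustrative names, not MXNet classes.
   class Layer:
       def __init__(self, weight=None):
           # Reuse the given weight object if tying, else create our own.
           self.weight = weight if weight is not None else object()
           # The bias is always a fresh, layer-local parameter; tying does
           # not cover it, so it must be initialized separately.
           self.bias = object()

   embedding = Layer()
   decoder = Layer(weight=embedding.weight)  # tie the decoder to the embedding

   print(decoder.weight is embedding.weight)  # True: weight is tied
   print(decoder.bias is embedding.bias)      # False: bias is separate
   ```

   This is only a sketch of why the bias, rather than the weight, is the parameter that can fall through the cracks when tying and sharing are combined.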
   
   I print some debug information which may be helpful here:
   ```
   (Pdb) model_eval.decoder[0]._params['awdrnn0_hybridsequential0_embedding0_bias'].list_ctx()
   *** RuntimeError: Parameter 'awdrnn0_hybridsequential0_embedding0_bias' has not been initialized
   ```
   
   ## Environment
   
   We recommend using our script to collect the diagnostic information. Run the following
command and paste the output below:
   ```
    curl --retry 10 -s https://raw.githubusercontent.com/dmlc/gluon-nlp/master/tools/diagnose.py | python
   
   # paste outputs here
   ```
   
