singa-dev mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [singa] dcslin edited a comment on pull request #697: New Model Layer Operator API
Date Mon, 25 May 2020 10:21:44 GMT

dcslin edited a comment on pull request #697:
URL: https://github.com/apache/singa/pull/697#issuecomment-633499903


   > > there are no params explicitly defined by MyModel; therefore there should be no direct params of MyModel. All params are from the sublayers of MyModel, i.e., l1 and l2.
   > 
   > There is a case for params explicitly defined by MyModel; for example, the singa native RNN implementation by default lets the model handle the initial states/hidden values:
   > https://github.com/apache/singa/blob/master/examples/rnn/train.py#L47-L48
   
   Another caveat is that for an RNN/LSTM network there are hidden/cell states, `self.hx` and `self.cx`, defined at the `mymodel` level, outside the lstm layer class. During `mymodel` compilation, operators are instantiated in the forward pass and the `creator` of each intermediate tensor is assigned. After that, `self.hx.creator` points to the last operator of the forward (compile) pass: in lstm it is `mul`, in rnn it is `tanh`. If training then starts directly, the backward pass would fail when it tries to backpropagate all the way into the operators created during `compile`.
   For now I need to manually reset `self.hx.creator` after compile.
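   To make the caveat concrete, here is a minimal toy sketch (not SINGA's actual API) of an autograd where each tensor records the `creator` operator that produced it; the `Op`/`Tensor` classes and `ops_reached` helper are hypothetical, but the failure mode and the workaround of resetting `creator` after compile mirror the comment above:

```python
# Toy autograd sketch (hypothetical, not SINGA's real classes): each Tensor
# records the `creator` Op that produced it, and backward walks creator links.
class Op:
    def __init__(self, name, inputs):
        self.name = name
        self.inputs = inputs  # input tensors of this operator

class Tensor:
    def __init__(self, creator=None):
        self.creator = creator

def ops_reached(t):
    """Names of all operators a backward walk from `t` would visit."""
    seen, stack = [], [t]
    while stack:
        cur = stack.pop()
        if cur.creator is not None:
            seen.append(cur.creator.name)
            stack.extend(cur.creator.inputs)
    return seen

# "Compile" forward pass: the state tensor hx ends up with a creator
# pointing at the last operator of the compile graph (e.g. `mul` in lstm).
x = Tensor()
hx = Tensor()
h1 = Tensor(Op("matmul", [x, hx]))
hx.creator = Op("mul", [h1])

# Backward from hx now walks back into the compile-pass operators,
# which is what makes real training fail in the scenario above.
assert "matmul" in ops_reached(hx)

# Workaround from the comment: reset the state tensor's creator after
# compile so later backward passes stop at the state boundary.
hx.creator = None
assert ops_reached(hx) == []
```

   The same idea applies to `self.cx`: any model-level state tensor touched during the compile forward pass needs its `creator` detached before real training begins.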


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


