singa-dev mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [singa] nudles commented on pull request #697: New Model Layer Operator API
Date Tue, 02 Jun 2020 14:05:06 GMT

nudles commented on pull request #697:
URL: https://github.com/apache/singa/pull/697#issuecomment-637565043


   > 
   > 
   > > > > > Summary:
   > > > > > ```
   > > > > > * set hx.creator and cx.creator to None. (Still can't use the graph to train correctly but can be executed normally)
   > > > > > 
   > > > > > * create ReLU layer instance
   > > > > > ```
   > > > > 
   > > > > 
   > > > > update conv and linear layers to include an argument for `activation`
   > > > > > ```
   > > > > > * create Loss layer instance
   > > > > > 
   > > > > > * remove set_attribute function, just copy the initial value from the tensor directly. Raise a warning in __setattr__ when the types do not match
   > > > > > ```
   > > > > 
   > > > > 
   > > > > Shall we totally disable reassignment, like self.W=..., because it may affect the graph as it replaces the tensor W with another tensor?
   > > > > > ```
   > > > > > * remove on_device function, get device info from input tensors
   > > > > > ```
   > > > 
   > > > 
   > > > But I'm not sure where reassignment is used in the whole project. Maybe it's used in many places. I think reassignment is still very common.
   > > 
   > > 
   > > I see.
   > > Will reassignment have any side effect on the computational graph?
   > 
   > Reassignment will not be buffered in the graph. If we want to update a tensor in the graph, we may need to copy (which will create an operator) the new value into the original tensor, just like we use Axpy to update parameters. Another strategy is to swap the pointers of the two blocks.
   > 
   > ```python
   > a = Tensor() # block1->ptr1->addr1
   > b = Tensor() # block2->ptr2->addr2
   > a = b # inc ref_count of block2, copy the data (memory address, size etc.) in block2 to block1.
   > # if the ref_count of block1 isn't equal to 1, this strategy will have some problems.
   > ```
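   
   As a minimal standalone sketch of why plain reassignment is not seen by the graph while an in-place copy (the Axpy-like strategy) is, here is the same idea using numpy arrays as a stand-in for Tensor blocks. This is only an illustration of Python name binding, not singa's actual memory management:
   
   ```python
   import numpy as np
   
   a = np.zeros(3)      # "block1": the parameter captured when the graph was built
   graph_ref = a        # the buffered graph keeps a reference to block1
   b = np.ones(3)       # "block2": e.g. a new value we want to load
   
   a = b                # rebinds the name only; graph_ref still points at block1
   print(graph_ref)     # [0. 0. 0.] -> the graph keeps computing with the old data
   
   graph_ref[...] = b   # in-place copy into block1 (what a copy/Axpy operator would do)
   print(graph_ref)     # [1. 1. 1.] -> now the graph sees the new value
   ```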
   
   if the model is compiled and the graph is created, then we do self.W = parameters[self.W.name] (reassign W to another tensor).
   Since the graph is created already, this reassignment will not be seen by the graph, and the graph will use the original tensor.
   Can we generate a warning or error in `__setattr__` when a tensor attribute is reassigned to a new tensor while the graph is on?
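   
   A rough sketch of what such a check could look like; the `graph_on` flag and the bare `Layer` class here are placeholders for the sketch, not the actual singa API:
   
   ```python
   import warnings
   
   class Layer:
       graph_on = False  # placeholder flag; singa would track this on the real graph/model
   
       def __setattr__(self, name, value):
           old = self.__dict__.get(name)
           # if a tensor attribute is rebound after the graph was created, warn the user:
           # the buffered graph still holds the original tensor, so the new one is ignored.
           if self.graph_on and old is not None and type(old).__name__ == 'Tensor':
               warnings.warn("tensor attribute '%s' reassigned after the graph was "
                             "created; the graph keeps using the original tensor" % name)
           object.__setattr__(self, name, value)
   ```
   
   Turning the warning into an error would be a one-line change if we decide reassignment should be disallowed entirely.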




