singa-dev mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SINGA-115) Print layer debug information in the neural net graph file
Date Fri, 25 Dec 2015 09:15:49 GMT

    [ https://issues.apache.org/jira/browse/SINGA-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15071440#comment-15071440 ]

ASF subversion and git services commented on SINGA-115:
-------------------------------------------------------

Commit 1f977b1a1701ccf6958ebecb3ad3858da93b3578 in incubator-singa's branch refs/heads/master
from [~flytosky]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-singa.git;h=1f977b1 ]

SINGA-115 - Print layer debug information in the neural net graph file

Updated the Graph class to format its string representation in the ToJson function.
Updated the Layer::ToString function to generate a string of debug info, e.g., the norms of
the layer data and gradient.
Inserted code in the BPWorker class to collect debug info from the layers' forward and backward functions.
Added the NeuralNet::ToGraph function to convert a neural net into a Graph, which can be converted into
a string using the ToJson function.
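
As a rough illustration of the node-link idea described above, the following Python sketch holds nodes and edges and serializes them to JSON. It is written in Python for brevity and is not SINGA's C++ code: the class name, the method names, and the JSON keys ("nodes", "links", "id", "debug") are all hypothetical and not taken from the commit.

```python
import json


class Graph:
    """Hypothetical stand-in for a node-link graph; not SINGA's Graph class."""

    def __init__(self):
        self.nodes = []   # list of {"id": name, plus optional attributes}
        self.links = []   # list of {"source": index, "target": index}
        self._index = {}  # node name -> position in self.nodes

    def add_node(self, name, **attrs):
        # Remember the node's position so edges can refer to it by index.
        self._index[name] = len(self.nodes)
        self.nodes.append({"id": name, **attrs})

    def add_edge(self, src, dst):
        self.links.append(
            {"source": self._index[src], "target": self._index[dst]}
        )

    def to_json(self):
        # Node-link form: one list of nodes, one list of index pairs.
        return json.dumps({"nodes": self.nodes, "links": self.links}, indent=2)


def demo_graph():
    """Build a two-layer example with a made-up debug string on one layer."""
    g = Graph()
    g.add_node("data")
    g.add_node("fc1", debug="data norm: 0.5")
    g.add_edge("data", "fc1")
    return g
```

Calling demo_graph().to_json() yields a JSON string with two nodes and one link; the debug string rides along as an ordinary node attribute.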

The Driver class writes the node-link representation of the net to the JSON file
workspace/visualization/train-net.json.

After enabling the debug option in the job conf, i.e., `debug: true`, two files are generated
for each BP iteration, named fp-stepxx-locxxx.json and
bp-stepxxx-locxxx.json, which record the norm of the data/gradient associated with each layer/param.
These files can be used to generate images for visualization via tools/graph.py.
We may later consider generating images directly, e.g., via a Graph::ToImage function, using the Graphviz library.
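
For readers who want a feel for the JSON-to-image step, here is a minimal Python sketch that turns node-link JSON into Graphviz DOT source. It is not the code in tools/graph.py, and the JSON schema it reads ("nodes" with "id" and an optional "debug" string, "links" with "source"/"target" indices) is an assumption made for illustration; the actual files may be laid out differently.

```python
import json


def node_link_to_dot(text):
    """Convert node-link JSON text into Graphviz DOT source (assumed schema)."""
    graph = json.loads(text)
    nodes = graph["nodes"]
    lines = ["digraph net {"]
    for node in nodes:
        # Show the debug string, if any, on a second line of the label.
        # "\\n" is a literal backslash-n, which DOT renders as a line break.
        label = node["id"]
        if "debug" in node:
            label += "\\n" + node["debug"]
        # Note: quotes inside names are not escaped in this sketch.
        lines.append('  "{}" [label="{}"];'.format(node["id"], label))
    for link in graph["links"]:
        # Links refer to nodes by their position in the node list.
        src = nodes[link["source"]]["id"]
        dst = nodes[link["target"]]["id"]
        lines.append('  "{}" -> "{}";'.format(src, dst))
    lines.append("}")
    return "\n".join(lines)
```

The returned text can be saved to a file such as net.dot and rendered with Graphviz, e.g., `dot -Tpng net.dot -o net.png`.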


> Print layer debug information in the neural net graph file
> ----------------------------------------------------------
>
>                 Key: SINGA-115
>                 URL: https://issues.apache.org/jira/browse/SINGA-115
>             Project: Singa
>          Issue Type: New Feature
>            Reporter: wangwei
>
> It is non-trivial to debug deep learning code, e.g., the BP algorithm, the hybrid
> partitioning, and layer implementations.
> In SINGA, we print the neural net in the INFO log as a JSON string, which can be converted
> into an image of the net graph (nodes are layers). This graph can be used to check the neural
> net configuration, e.g., layer connections and neural net partitioning. However, it does not
> collect run-time data, e.g., the gradient norm or value norm of each layer, which is important
> for debugging accuracy problems and similar bugs.
> In this ticket, we will collect the gradient and value norm of each layer and each Param
> object. This information will be printed as attributes (or sub-nodes) of the layer node in
> the neural net graph. Users/developers can locate bugs by inspecting the graph after
> converting the JSON string into an image.
> In particular, users can set disp_freq to 1 and the number of running steps to a small number, e.g.,
> 5. Then 5 neural net graphs will be printed, one per step. The debug option should be turned
> on in the job.conf file for printing.
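
To make the per-layer norms in the ticket concrete, here is a minimal Python sketch. The ticket does not say which norm is used, so the L2 (Euclidean) norm below is an assumption, and the function names and the input layout are hypothetical rather than SINGA's API.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat sequence of numbers (assumed norm type)."""
    return math.sqrt(sum(v * v for v in values))


def collect_norms(layers):
    """Map each layer name to the norms of its data and gradient.

    `layers` is a hypothetical dict: name -> (data values, gradient values).
    """
    return {
        name: {"data_norm": l2_norm(data), "grad_norm": l2_norm(grad)}
        for name, (data, grad) in layers.items()
    }
```

For example, collect_norms({"fc1": ([3.0, 4.0], [0.0, 0.0])}) reports a data norm of 5.0 and a gradient norm of 0.0 for fc1; a gradient norm that stays at zero for one layer is the kind of signal such a graph is meant to expose.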



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
