Mailing-List: contact dev-help@singa.incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@singa.incubator.apache.org
Date: Mon, 26 Sep 2016 11:18:20 +0000 (UTC)
From: "hacker99 (JIRA)" <jira@apache.org>
To: dev@singa.incubator.apache.org
Message-ID: <JIRA.13007449.1474814783000.662071.1474888700668@Atlassian.JIRA>
In-Reply-To: <JIRA.13007449.1474814783000@Atlassian.JIRA>
References: <JIRA.13007449.1474814783000@Atlassian.JIRA> <JIRA.13007449.1474814783621@arcas>
Subject: [jira] [Commented] (SINGA-249) Convolution BP
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
archived-at: Mon, 26 Sep 2016 11:18:27 -0000


    [ https://issues.apache.org/jira/browse/SINGA-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15522762#comment-15522762 ]

hacker99 commented on SINGA-249:
--------------------------------

Thank you! I am stupid, I should have taken a look at the code.

> Convolution BP
> --------------
>
>                 Key: SINGA-249
>                 URL: https://issues.apache.org/jira/browse/SINGA-249
>             Project: Singa
>          Issue Type: Wish
>         Environment: ubuntu 14.04, singa 1.0
>            Reporter: hacker99
>
> I'm curious about how the gradient is calculated in the back propagation algorithm, e.g. for the Convolution layer. Can anyone explain the details of the formula and how the code implements it? I would be very grateful for pointers to any documents, or just an explanation of why dw += Mult(grad_b, col_data.T()).
> #code from src/model/layer/convolution.cc
> const std::pair<Tensor, vector<Tensor>> Convolution::Backward(
>     int flag, const Tensor &grad) {
>   CHECK_EQ(grad.device()->lang(), kCpp);
>   CHECK_EQ(grad.nDim(), 4u);
>   CHECK(!buf_.empty());
>   Tensor src_data = buf_.top();
>   buf_.pop();
>   vector<Tensor> param_grad;
>   Tensor dx;
>   Tensor db, dw;
>   dx.ResetLike(src_data);
>   db.ResetLike(bias_);
>   dw.ResetLike(weight_);
>   dw.SetValue(0.0f);
>   size_t batchsize = grad.shape(0);
>   size_t imagesize = src_data.Size() / batchsize;
>   if (bias_term_) {
>     Tensor tmp1 =
>         Reshape(grad, Shape{batchsize * num_filters_,
>                             grad.Size() / (batchsize * num_filters_)});
>     Tensor tmp2(Shape{batchsize * num_filters_});
>     SumColumns(tmp1, &tmp2);
>     Tensor tmp3 = Reshape(tmp2, Shape{batchsize, num_filters_});
>     SumRows(tmp3, &db);
>   }
>   auto in_data = src_data.data<float>();
>   Tensor col_data(Shape{col_height_, col_width_});
>   float *data_col = new float[col_height_ * col_width_];
>   float *dx_b = new float[imagesize];
>   for (size_t b = 0; b < batchsize; b++) {
>     Im2col(in_data + b * imagesize, channels_, height_, width_, kernel_h_,
>            kernel_w_, pad_h_, pad_w_, stride_h_, stride_w_, data_col);
>     col_data.CopyDataFromHostPtr(data_col, col_height_ * col_width_);
>     Tensor grad_b(Shape{num_filters_, conv_height_ * conv_width_});
>     CopyDataToFrom(&grad_b, grad, grad_b.Size(), 0, b * grad_b.Size());
>     dw += Mult(grad_b, col_data.T());
>     Tensor dcol_b = Mult(weight_.T(), grad_b);
>     auto dcol_data = dcol_b.data<float>();
>     Col2im(dcol_data, channels_, height_, width_, kernel_h_, kernel_w_, pad_h_,
>            pad_w_, stride_h_, stride_w_, dx_b);
>     dx.CopyDataFromHostPtr(dx_b, imagesize, b * imagesize);
>   }
>   param_grad.push_back(dw);
>   param_grad.push_back(db);
>   delete[] data_col;
>   delete[] dx_b;
>   return std::make_pair(dx, param_grad);
> }
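
On the question of why dw += Mult(grad_b, col_data.T()): going by the shapes in the quoted code, Im2col turns one image into a matrix col of shape (col_height_, col_width_), and the forward pass for that image is presumably the plain matrix product Y = W * col, with W of shape (num_filters_, col_height_). Each output entry is Y[f][p] = sum_k W[f][k] * col[k][p], so with G = dL/dY (grad_b in the code) the chain rule gives dL/dW[f][k] = sum_p G[f][p] * col[k][p], which is G * col^T, and likewise dL/dcol = W^T * G. The += only accumulates the per-image result over the batch. The sketch below checks both identities numerically on tiny matrices. It does not use SINGA; Mat, MatMul, Transpose, Loss and MaxGradError are hypothetical helpers written for this illustration, and the weight shape is an assumption inferred from the code above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Minimal row-major matrix; a hypothetical stand-in for SINGA's Tensor.
struct Mat {
  std::size_t rows, cols;
  std::vector<double> v;
  Mat(std::size_t r, std::size_t c) : rows(r), cols(c), v(r * c, 0.0) {}
  double &at(std::size_t i, std::size_t j) { return v[i * cols + j]; }
  double at(std::size_t i, std::size_t j) const { return v[i * cols + j]; }
};

// Plain dense matrix product c = a * b.
Mat MatMul(const Mat &a, const Mat &b) {
  assert(a.cols == b.rows);
  Mat c(a.rows, b.cols);
  for (std::size_t i = 0; i < a.rows; ++i)
    for (std::size_t k = 0; k < a.cols; ++k)
      for (std::size_t j = 0; j < b.cols; ++j)
        c.at(i, j) += a.at(i, k) * b.at(k, j);
  return c;
}

Mat Transpose(const Mat &a) {
  Mat t(a.cols, a.rows);
  for (std::size_t i = 0; i < a.rows; ++i)
    for (std::size_t j = 0; j < a.cols; ++j) t.at(j, i) = a.at(i, j);
  return t;
}

// Scalar test loss L = sum_ij G_ij * Y_ij with Y = W * col, so dL/dY = G.
double Loss(const Mat &w, const Mat &col, const Mat &g) {
  const Mat y = MatMul(w, col);
  double s = 0.0;
  for (std::size_t i = 0; i < y.v.size(); ++i) s += g.v[i] * y.v[i];
  return s;
}

// Returns the largest gap between the analytic gradients
// (dW = G * col^T, dcol = W^T * G) and central finite differences.
double MaxGradError() {
  Mat w(2, 3), col(3, 4), g(2, 4);  // 2 filters, 3 patch entries, 4 positions
  for (std::size_t i = 0; i < w.v.size(); ++i) w.v[i] = 0.1 * (i + 1);
  for (std::size_t i = 0; i < col.v.size(); ++i) col.v[i] = 0.5 - 0.07 * i;
  for (std::size_t i = 0; i < g.v.size(); ++i) g.v[i] = 0.3 * i - 1.0;

  const Mat dw = MatMul(g, Transpose(col));  // (2x4) * (4x3) -> 2x3, like W
  const Mat dcol = MatMul(Transpose(w), g);  // (3x2) * (2x4) -> 3x4, like col

  const double eps = 1e-6;
  double max_err = 0.0;
  // Perturb each entry of `param` in place and compare with `analytic`.
  auto check = [&](std::vector<double> &param, const Mat &analytic) {
    for (std::size_t i = 0; i < param.size(); ++i) {
      const double saved = param[i];
      param[i] = saved + eps;
      const double up = Loss(w, col, g);
      param[i] = saved - eps;
      const double down = Loss(w, col, g);
      param[i] = saved;
      const double numeric = (up - down) / (2 * eps);
      max_err = std::max(max_err, std::fabs(numeric - analytic.v[i]));
    }
  };
  check(w.v, dw);
  check(col.v, dcol);
  return max_err;
}
```

Calling MaxGradError() from a small main should return a value near zero if the two identities hold. In the quoted Backward code, the W^T * G result (dcol_b) is then passed through Col2im, which folds the column gradient back into the image layout to produce dx for that image.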


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)