No. Loss != Inaccuracy.
If you want to compute the accuracy, you need to create an
evaluator=singa.Accuracy(), and call evaluator.Evaluate(o, t), where o is
the output from the dense layer and t is the ground truth tensor. You can
follow the example here
https://github.com/apache/incubatorsinga/blob/master/python/singa/metric.py#L67
.
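
To illustrate why a small loss is not the same thing as a small error rate, here is a minimal NumPy sketch (not SINGA code; the scores and labels are made up for illustration). It computes the softmax cross-entropy and the accuracy on the same predictions, and the two numbers are clearly different quantities.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels under softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def accuracy(logits, labels):
    """Fraction of rows whose arg-max class equals the label."""
    return float((logits.argmax(axis=1) == labels).mean())

# Hypothetical scores for 4 samples and 3 classes (illustration only).
logits = np.array([[2.0, 1.0, 0.0],
                   [0.0, 2.0, 1.0],
                   [1.0, 0.0, 2.0],
                   [2.0, 1.9, 0.0]], dtype=np.float32)
labels = np.array([0, 1, 2, 1], dtype=np.int32)

loss = softmax_cross_entropy(logits, labels)
acc = accuracy(logits, labels)
print("loss = %.4f, accuracy = %.2f" % (loss, acc))
```

Three of the four rows are classified correctly, so the accuracy is 0.75, while the loss is an unbounded positive number that depends on how confident each prediction is.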
Good Luck!
On Sun, Oct 9, 2016 at 12:08 PM Arash Shafiei <arash.shafiei@gmail.com>
wrote:
> Thanks for the hint.
>
> I was sending it to the device but the problem turned out to be that I did
> not cast labels to int32.
>
> Now it is working and I am getting:
>
> [................... ] 96.4% training loss = 0.003444
> Epoch 49, train loss is 0.003509
> Epoch 49, evaluation loss is 0.003534
>
> Does this mean that after 50 epochs the evaluation has only 3.5% inaccuracy?
>
> On Sun, Oct 9, 2016 at 11:29 AM, Wei Wang <wangwei.cs@gmail.com> wrote:
>
> Have you moved all tensors onto the same device, including the tensor for
> the labels?
>
>
> On 9 Oct 2016, at 11:02 AM, Arash Shafiei <arash.shafiei@gmail.com> wrote:
>
> outputs = rnn.forward(model_pb2.kTrain, inputs)[0:-2]
> grads = []
> batch_loss = 0
> g_dense_w.set_value(0.0)
> g_dense_b.set_value(0.0)
> print 'outputs len', len(outputs)  # 128
> output = outputs[-1]
> act = dense.forward(model_pb2.kTrain, output)
> print 'output shape', output.shape  # (256, 28)
> print 'activation shape', act.shape  # (256, 6)
> print 'labels shape', labels.shape  # (256, 6)
> lvalue = lossfun.forward(model_pb2.kTrain, act, labels)
> batch_loss += lvalue.l1()  # fails here with:
> # [F d1009 t11:00:24 p23551:016
> # /home/wuwf/work/incubatorsinga/src/core/tensor/./tensor_math_cuda.h:344]
> # Check failed: status == CUBLAS_STATUS_SUCCESS (11 vs. 0)
> # CUBLAS_STATUS_MAPPING_ERROR
> # Aborted (core dumped)
>
>
>
>
> On Sun, Oct 9, 2016 at 10:55 AM, Wei Wang <wangwei.cs@gmail.com> wrote:
>
> Could you please paste the relevant code leading to this error?
>
>
>
> On 9 Oct 2016, at 10:32 AM, Arash Shafiei <arash.shafiei@gmail.com> wrote:
>
> Thanks, it worked.
>
> So far, I managed to do rnn::forward(...) but now I am stuck somewhere
> else.
>
> lossfun.forward(...) returns a tensor (denoted as lvalue). I have to obtain
> the L1 norm using lvalue.l1().
>
> But I get this error:
> [F d1009 t10:30:14 p23056:56
> /home/wuwf/work/incubatorsinga/src/core/tensor/./tensor_math_cuda.h:344]
> Check failed: status == CUBLAS_STATUS_SUCCESS (11 vs. 0)
> CUBLAS_STATUS_MAPPING_ERROR
> Aborted (core dumped)
>
> On Sat, Oct 8, 2016 at 9:43 PM, Wang Wei <wangwei@comp.nus.edu.sg> wrote:
>
> Actually, the charrnn example is of type (4), where each rnn unit
> generates a prediction and has a ground truth label.
>
> For your model (type 2), you only need to use y128 (of shape (256, 28))
> from rnn::forward() as the input to the dense layer. All other yi
> should be ignored.
> Consequently, you would have an output (denoted as o) of shape (256, 6)
> from the dense layer, which is the prediction for the whole sequence (of
> length 128).
> By feeding the prediction o and the label into the loss layer, you can
> compute the loss value and the gradient for o (denoted as o').
> Backward propagating o' through the dense layer, you would get the
> gradient for y128, denoted as y'128.
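
The forward and backward steps through the dense and loss layers described above can be sketched in plain NumPy. This is only an illustration of the shapes and the chain rule, not SINGA's implementation: the random data, the weight names W and b, and the choice to average the loss over the batch are all assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, hidden, classes = 256, 28, 6

# Stand-ins for y128, the dense parameters and the labels (all hypothetical).
y_last = rng.standard_normal((batch, hidden)).astype(np.float32)
W = (0.1 * rng.standard_normal((hidden, classes))).astype(np.float32)
b = np.zeros(classes, dtype=np.float32)
labels = rng.integers(0, classes, size=batch)

# Dense layer forward: the prediction o has shape (256, 6).
o = y_last @ W + b

# Softmax cross-entropy forward (loss averaged over the batch).
shifted = o - o.max(axis=1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
loss = float(-np.log(probs[np.arange(batch), labels]).mean())

# Gradient of the mean loss w.r.t. o, i.e. o': (probs - one_hot) / batch.
grad_o = probs.copy()
grad_o[np.arange(batch), labels] -= 1.0
grad_o /= batch

# Dense layer backward: parameter gradients and y'128.
grad_W = y_last.T @ grad_o        # shape (28, 6)
grad_b = grad_o.sum(axis=0)       # shape (6,)
grad_y_last = grad_o @ W.T        # shape (256, 28)

print(o.shape, grad_y_last.shape)
```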
>
> *The input of rnn::backward() would be <y'1, y'2, ..., y'128, hy', cy'>,
> where only y'128 is a valid tensor. y'1, y'2, ... should be tensors with
> value 0.*
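
As a rough sketch of that gradient list, using NumPy arrays as stand-ins for SINGA tensors: only the last entry carries a real gradient and every earlier entry is zero. The array grad_y_last is a placeholder for y'128, and how hy' and cy' are passed is not covered here.

```python
import numpy as np

seq_len, batch, hidden = 128, 256, 28

# Placeholder for y'128, the gradient coming back from the dense layer.
grad_y_last = np.ones((batch, hidden), dtype=np.float32)

# y'1 ... y'127 are all-zero; only y'128 holds a real gradient.
grads = [np.zeros((batch, hidden), dtype=np.float32)
         for _ in range(seq_len - 1)]
grads.append(grad_y_last)

print(len(grads), grads[0].shape)
```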
>
> Best,
> Wei
>
>
> On Sat, Oct 8, 2016 at 9:33 PM Arash Shafiei <arash.shafiei@gmail.com>
> wrote:
>
> Thanks. It worked.
>
> I am now at the phase of evaluating the loss.
>
> singa.loss.SoftmaxCrossEntropy has a forward function where it takes
> prediction tensors and ground truth.
>
> My problem now is that the prediction is a sequence and my label is not a
> sequence.
>
> Your charrnn example is an application of type (1) in the figure below,
> but activity recognition is an application of type (2).
>
>
> <rnnapp.png>
> Therefore for each sequence in a batch I have only 1 label (although this
> label can be one-dimensional, from the set {1,2,3,4,5,6}, or 6-dimensional,
> from the set { [1,0,0,0,0,0], [0,1,0,0,0,0], etc. }).
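
The two label encodings mentioned above are interchangeable. A small NumPy sketch, with made-up label values, shows how to go from labels in {1, ..., 6} to zero-based int32 class indices and to one-hot vectors:

```python
import numpy as np

num_classes = 6
# Hypothetical activity labels in {1, ..., 6}, one per sequence.
raw_labels = np.array([1, 3, 6, 2])

# Zero-based int32 class indices, shape (batch,).
class_ids = (raw_labels - 1).astype(np.int32)

# Equivalent one-hot encoding, shape (batch, 6).
one_hot = np.eye(num_classes, dtype=np.float32)[class_ids]

print(class_ids.shape, one_hot.shape)
```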
>
> So now I need predictions and ground truth. The prediction for me is of
> shape
> (128, 256, 28)
> where 128 is the length of the sequence, 256 is the batch size and 28 is
> the hidden layer size.
>
> And my ground truth is of shape
> (256, 1) or (256, 6), depending on how you model it.
>
> But as I understood from the example of charrnn my ground truth must be
> of shape:
> (128, 256)
>
> Would you have any insight about this?
> Thanks..
>
>
> On Sat, Oct 8, 2016 at 6:42 PM, Wang Wei <wangwei@comp.nus.edu.sg> wrote:
>
> Currently, a numpy array of dtype=np.float32 or np.int can be converted
> into a singa tensor.
> Please convert the numpy array into np.float32 and then call
> tensor.from_numpy(t) (without dtype=np.float32).
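
A minimal sketch of the cast, using only NumPy (the feature array is made up). NumPy creates float64 arrays by default, which is what triggers the "Not implemented yet for float64" error.

```python
import numpy as np

# Hypothetical feature batch; NumPy produces float64 here by default.
features = np.random.uniform(-1.0, 1.0, size=(4, 9))
print(features.dtype)

# Cast to float32 before handing the array to tensor.from_numpy(...).
features32 = features.astype(np.float32)
print(features32.dtype)
```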
>
> On Sat, Oct 8, 2016 at 6:36 PM Arash Shafiei <arash.shafiei@gmail.com>
> wrote:
>
> The values that I have are floating points in [-1, 1].
>
> While using tensor.from_numpy(...), I was getting this error:
>
> Not implemented yet for float64
>
> I understood from the tutorial that we could pass the data type:
>
> y = tensor.from_numpy(..., dtype=np.float32)
>
> But using dtype, I am getting another error:
>
> TypeError: from_numpy() got an unexpected keyword argument 'dtype'
>
>
>
> On Sat, Oct 8, 2016 at 3:45 PM, Wang Wei <wangwei@comp.nus.edu.sg> wrote:
>
> Hi
>
> According to the API of forward function:
> http://singa.apache.org/en/docs/layer.html#singa.layer.RNN.forward
> The input should be a vector of Tensors, <x1, x2, ..., x128, hx, cx>, where xi is
> of shape (1500, 9); hx and cx are optional, and their shape should be (1500, 28).
> The output would be a vector of Tensors, <y1, y2, ..., y128, hy, cy>, where yi
> is of shape (1500, 28); hy and cy are optional, depending on the existence
> of hx and cx.
> If you want to put the dense layer on top of the last rnn unit (i.e. the
> 128th), then you feed y128 to the dense layer.
>
> The convert function just reshapes the raw data into a sequence of tensors
> <x1, x2, ..>.
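
The reshaping step can be sketched in NumPy as follows. This is an assumption about what such a step looks like, not the code of the example's convert function: it splits a (batch, sequence, feature) array into one (batch, feature) array per time step.

```python
import numpy as np

batch, seq_len, features = 1500, 128, 9
# Placeholder data with the shapes discussed in this thread.
data = np.zeros((batch, seq_len, features), dtype=np.float32)

# One array per time step: x1 ... x128, each of shape (1500, 9).
inputs = [np.ascontiguousarray(data[:, t, :]) for t in range(seq_len)]

print(len(inputs), inputs[0].shape)
```

Each array in the list would then still need to be turned into a tensor before being passed to the rnn layer.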
>
> BTW, people typically use a smaller batch size, e.g. less than 256.
>
> May I forward our discussion to the incubator email list in case others
> have similar problems?
> Thanks.
>
> Best,
> Wei
>
> So here is what I have:
>
> input batch of dimension (1500, 128, 9)
> This means a batch of 1500 windows, each having 128 vectors of 9 dimensions.
>
> input label of dimension (1500, 6)
> This means a label batch of 1500 vectors of 6 dimensions. This is to label
> if the person is sitting ([1,0,0,0,0,0]) or standing ([0,1,0,0,0,0]), etc.
>
> I am creating an lstm layer with hidden_size=28 and
> input_sample_shape=(9,) and num_stacks=1
>
> Then I create a dense layer with num_output=6 and input_sample_shape=(28,)
>
> Now I would like to feed the data to the 'forward' function of the lstm and
> dense layers. But I could not make it work and I could not quite understand
> from the example what 'convert' and 'numpy2tensors' are supposed to do.
>
> I would appreciate your comments.
>
> On Sun, Sep 25, 2016 at 12:23 PM, Arash Shafiei <arash.shafiei@gmail.com>
> wrote:
>
> Yes, I was thinking of batch size to be 32.
>
> Thanks. I am getting a better sense of how it works, and I am wondering how an
> RNN would be helpful, because we do not want to predict a sequence. We just
> have a sequence (in raw data) and a set of features (in processed data) and we
> want to know the classification.
>
> So I was thinking of using other approaches with SINGA. I understood that
> there is also MLP. We could use MLP from SINGA to see the result first.
>
> In this case input would be a set of 561 values with a label.
> Then the MLP, given a set of test data with 561 features would predict the
> label.
>
> Thanks for the advice.
>
>
>
> On Sun, Sep 25, 2016 at 12:03 PM, Wang Wei <wangwei@comp.nus.edu.sg>
> wrote:
>
>
>
> On Sun, Sep 25, 2016 at 9:37 AM, Arash Shafiei <arash.shafiei@gmail.com>
> wrote:
>
> Hi Wang Wei,
>
> I am trying to understand the charrnn example, but there is still
> something that I am missing and cannot figure it out by myself.
>
> The convert function creates two numpy arrays x and y. As I understood, the
> array x is the data and the array y is the labels.
>
> I checked the dimensions of these arrays.
> x.shape is (32, 100, 101)
> y.shape is (32, 100)
>
> 32 is the batch size
> 100 is the sequence size
> 101 is the vocabulary size, i.e. there are 101 unique chars in
> linux_input.txt. Each input from one sample at one time step is a
> one-hot vector with all positions being 0 except the position of the
> character (set to 1).
>
>
> given a sequence of chars, a,b,c,d,e,f
> if the input (x) is a, b, c, d, e
> then the label is b, c, d, e, f
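
The shifted input/label pairing can be sketched with NumPy as below. The tiny text and the variable names are made up for illustration; the actual example builds its vocabulary from linux_input.txt.

```python
import numpy as np

text = "abcdef"
vocab = sorted(set(text))
char_to_idx = {ch: i for i, ch in enumerate(vocab)}

ids = np.array([char_to_idx[ch] for ch in text], dtype=np.int32)

# Inputs are a..e; labels are the same sequence shifted by one, b..f.
x_ids, y_ids = ids[:-1], ids[1:]

# One-hot inputs: one row per time step, one column per vocabulary entry.
x_onehot = np.eye(len(vocab), dtype=np.float32)[x_ids]

print(x_onehot.shape, y_ids.shape)
```

So every time step has its own label, which is why y has one entry per position in the sequence rather than one per sample.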
>
>
>
> In my understanding you are taking a batch of 100 characters and the next
> character must be the label. So according to my understanding
> x.shape must be (32, 100)
> y.shape must be (32, 1)
>
> I mean that you have a batch of 32 samples to train and each sample is a
> series of 100 characters. For each sample, there must be a label, which says
> what character must follow this series. And that character is only 1.
>
> Is there anything that I do not quite understand?
>
> I would need this information in order to modify your sample program for
> the activity recognition.
> So ultimately in my use case:
> x.shape probably is (32, 561)
> y.shape probably is (32, 1)
>
>
> For your case, if you use 561 features, then how about the sequence length?
> Is 32 the batchsize?
>
> 561 are floating point features which are in [-1, 1].
> 1 is the label which is in [1,2,3,4,5,6]
>
> I would appreciate your help.
> Thanks.
>
>
