Hi Nikolay,
I appreciate your interest in the project. To answer your question: you
should be able to write "X %*% W + B" and get the semantics you want. When
the SystemML compiler sees a cellwise operation between a matrix and a
vector, it automatically replicates (broadcasts) the vector across the
rows or columns of the matrix. So if you run the DML code:
A = matrix (1.0, rows=3, cols=3)
v = matrix (2.0, rows=1, cols=3)
sum = A + v
print(toString(sum))
the output will be:
3.000 3.000 3.000
3.000 3.000 3.000
3.000 3.000 3.000
Exposing cellwise matrix-vector operations to the SystemML optimizer in
this way should result in more efficient parallel plans, since it's easier
for the optimizer to detect that it can broadcast the vector and stream the
matrix.
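For readers more comfortable with NumPy, the same cellwise broadcast can be sketched there (this is an analogy to illustrate the semantics, not SystemML's implementation):

```python
import numpy as np

# Mirror the DML example: a 3 x 3 matrix of 1.0 plus a 1 x 3 row
# vector of 2.0. The vector is broadcast across the rows, with no
# explicit replication needed.
A = np.full((3, 3), 1.0)
v = np.full((1, 3), 2.0)
s = A + v
print(s)  # every cell is 3.0
```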
The PNMF script on the SystemML home page (http://systemml.apache.org) has
a more in-depth example of the same pattern:
while (iter < max_iterations) {
  iter = iter + 1;
  H = (H * (t(W) %*% (V / (W %*% H)))) / t(colSums(W));
  W = (W * ((V / (W %*% H)) %*% t(H))) / t(rowSums(H));
  obj = as.scalar(colSums(W) %*% rowSums(H)) - sum(V * log(W %*% H));
  print("iter=" + iter + " obj=" + obj);
}
The divisor t(colSums(W)) in the update of H divides the matrix
(H * (t(W) %*% (V/(W%*%H)))) by a vector. In R, the divisor in this
expression would need to be
(t(matrix(colSums(W),nrow=1)) %*% matrix(rep(1,m),nrow=1)) or something
equivalent.
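If it helps, here is a small NumPy sketch (with made-up shapes n=4, k=2, m=5) showing that dividing by the broadcast column vector gives the same result as dividing by an explicitly replicated matrix, which is the replication the DML compiler saves you from writing:

```python
import numpy as np

# Hypothetical small shapes for illustration: W is n x k, H is k x m.
n, k, m = 4, 2, 5
rng = np.random.default_rng(0)
W = rng.random((n, k)) + 0.1
H = rng.random((k, m)) + 0.1

# t(colSums(W)) as a k x 1 column vector.
d = W.sum(axis=0).reshape(k, 1)

# Broadcast division (what the DML divisor does implicitly) ...
broadcast = H / d
# ... equals dividing by an explicitly replicated k x m matrix,
# built here as the outer product with a 1 x m row of ones.
replicated = H / (d @ np.ones((1, m)))

print(np.allclose(broadcast, replicated))  # True
```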
I think that an example script for training Boltzmann machines would be a
useful addition to the SystemML distribution. Would you mind opening a JIRA
issue for adding this script and posting a link to the JIRA on the SystemML
mailing list? Our JIRA instance is at
https://issues.apache.org/jira/browse/SYSTEMML, and our mailing list is at
http://systemml.apache.org/community. By the way, it's best to post
questions like this one to the mailing list so that others who run into the
same issue will have an easier time finding the solution; I'm CCing the
list with my response here.
Fred
From: Nikolay Manchev/UK/IBM
To: Frederick R Reiss/Almaden/IBM@IBMUS
Date: 07/04/2016 01:55 PM
Subject: Question on SystemML - RBMs and repmat()
Hi Fred,
I got your email from Frank Ketlaars and it is my understanding that you
are one of our SystemML gurus.
I am doing some research on the use of Boltzmann machines in a big data
context, and I've been playing with SystemML for some time. I wrote a DML
script that trains a Restricted Boltzmann Machine using one-step
contrastive divergence, and the results look quite good so far. I've done
some functional testing on the MNIST data set, feeding the output of
about 200 hidden neurons to SVM and Naive Bayes classifiers, and this
substantially reduces the convergence times without impacting the accuracy.
Anyway, I have one simple question and I was hoping you could spare a
moment to provide some feedback. Because I use different update rules for
the weights and biases of the RBM, I need a function that constructs a
new matrix by repeating a vector a number of times (something like
numpy.tile() or MATLAB's repmat()). I need this for constructing the bias
matrix B from a vector, in expressions like X %*% W + B.
I looked at the DML reference guide, but I couldn't identify anything that
would help with this. I wrote my own function, which looks like this:
rep = function(matrix[double] X, int times) return (matrix[double] retval)
{
  retval = X
  i = 1
  while (i < times) {
    retval = rbind(retval, X)
    i = i + 1
  }
}
but I am concerned about how efficient this would be, and I keep wondering
if there is a better solution. Would you mind sharing your thoughts or
pointing me in the right direction?
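In NumPy terms, what the rep() function above is meant to compute is the following (a quick sketch with a made-up 1 x 3 vector):

```python
import numpy as np

# What rep(X, 3) computes, expressed with numpy.tile: stack 3
# copies of the row vector X vertically into a 3 x 3 matrix.
X = np.array([[1.0, 2.0, 3.0]])
tiled = np.tile(X, (3, 1))
print(tiled.shape)  # (3, 3)
```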
If you are interested, the RBM scripts are available here (I've been
thinking about creating a pull request to see if anyone on the SystemML
dev team would be interested in adding them to SystemML):
https://github.com/nmanchev/incubator-systemml/blob/neuralnets/scripts/algorithms/rbm_minibatch.dml
https://github.com/nmanchev/incubator-systemml/blob/neuralnets/scripts/algorithms/rbm_run.dml
Kind regards
Nikolay
Nikolay Manchev
Data Scientist, Big Data Technical Team - Europe
IBM Analytics
Phone: +44 7919 565747
Email: nmanchev@uk.ibm.com
My personal blog: cleverowl.uk