[ https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ted Dunning updated MAHOUT-228:

Attachment: r.csv
logP.csv
sgd.csv
I have been doing some testing on the training algorithm and there seems to be a glitch in
it. The problem is that the prior gradient is strong enough that for any lambda larger than a
really small value, the regularization zeros out all of the coefficients on every iteration. Not good.
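To make the failure mode concrete, here is a minimal R sketch of the kind of per-step L1 shrinkage
a truncated-gradient SGD trainer applies. This is an illustration only, not the actual Java code;
the function name l1.step and the learning rate eta are my own names for this note.
{noformat}
# Sketch of a per-step L1 (truncated gradient) shrinkage, for illustration
# only.  eta is the learning rate; each step pulls every coefficient toward
# zero by eta*lambda and clips at zero.
l1.step <- function(beta, eta, lambda) {
  sign(beta) * pmax(abs(beta) - eta * lambda, 0)
}
# If eta*lambda exceeds every |beta[i]|, a single step zeros the whole vector:
l1.step(c(0.05, -0.02, 0.01), eta = 0.1, lambda = 1)
# [1] 0 0 0
{noformat}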
I will attach some sample data that I have been using for these experiments. The reference
for these experiments was an optimization I did in R where I explicitly optimized a simple
example and got very plausible results.
For the R example, I used the following definition of the function to optimize:
{noformat}
f <- function(beta) {
  # predicted probability of class 1 for each row of x
  p = w(rowSums(x %*% matrix(beta, ncol=1)));
  # log likelihood; the (p==0) and (p==1) terms guard against log(0)
  r1 = sum(y*log(p+(p==0))+(1-y)*log(1-p+(p==1)));
  # log prior for the L1 (Laplace) penalty
  r2 = -lambda*sum(abs(beta));
  # optim minimizes, so return the negated log posterior
  -(r1+r2)
}
w <- function(x) {
  return(1/(1+exp(-x)))
}
{noformat}
Here beta is the coefficient vector, lambda sets the amount of regularization, x is the matrix of
input vectors with one observation per row, y holds the known categories for the rows of x, f is the
combined log likelihood (r1) and log prior (r2), negated for minimization, and w is the logistic
function. I used the unsimplified form of the overall logistic likelihood for the sake of clarity.
Normally a simpler form such as sum(y - p) is used for the gradient, but I wanted to keep things
straightforward.
The attached file sgd.csv contains the value of x. The value of y is simply 30 0's followed
by 30 1's.
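In R, the setup is something like this (a sketch: I am assuming sgd.csv reads directly with
read.csv and that the optimization starts from a zero vector):
{noformat}
x <- as.matrix(read.csv("sgd.csv"))  # one observation per row
y <- c(rep(0, 30), rep(1, 30))       # 30 0's followed by 30 1's
beta <- rep(0, ncol(x))              # starting point for optim (assumed)
{noformat}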
Optimization was done using this:
{noformat}
lambda <- 0.1
beta.01 <- optim(beta, f, method="CG", control=list(maxit=10000))
lambda <- 1
beta.1 <- optim(beta, f, method="CG", control=list(maxit=10000))
lambda <- 10
beta.10 <- optim(beta, f, method="CG", control=list(maxit=10000))
{noformat}
The values for beta obtained are contained in the file r.csv and the log MAP likelihoods are
in logP.csv.
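Since f returns the negated log posterior, the log MAP likelihood for each solution is just
-f evaluated at the optimum with the matching lambda set first; roughly the following, assuming
logP.csv holds exactly these values:
{noformat}
# each -f(...) should match the corresponding entry in logP.csv
lambda <- 0.1; print(-f(beta.01$par))
lambda <- 1;   print(-f(beta.1$par))
lambda <- 10;  print(-f(beta.10$par))
{noformat}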
I will shortly add a patch that has my initial test in it. This patch will contain these
test data files. I will be working on this problem off and on over the next few days, but
any hints that anybody has are welcome. My expectation is that there is a silly oversight
in my Java code.
> Need sequential logistic regression implementation using SGD techniques
> ------------------------------------------------------------------------
>
> Key: MAHOUT-228
> URL: https://issues.apache.org/jira/browse/MAHOUT-228
> Project: Mahout
> Issue Type: New Feature
> Components: Classification
> Reporter: Ted Dunning
> Fix For: 0.3
>
> Attachments: logP.csv, MAHOUT-228-1.patch, MAHOUT-228-2.patch, r.csv, sgd.csv
>
>
> Stochastic gradient descent (SGD) is often fast enough for highly scalable learning (see
> Vowpal Wabbit, http://hunch.net/~vw/).
> I often need to have a logistic regression in Java as well, so that is a reasonable place
> to start.

This message is automatically generated by JIRA.

You can reply to this email to add a comment to the issue online.
