# mahout-dev mailing list archives

From "Ted Dunning (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAHOUT-228) Need sequential logistic regression implementation using SGD techniques
Date Fri, 25 Dec 2009 21:02:29 GMT
```
[ https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Dunning updated MAHOUT-228:
-------------------------------

Attachment: r.csv
logP.csv
sgd.csv

I have been doing some testing on the training algorithm, and there seems to be a glitch in
it.  The problem is that the prior gradient is strong enough that, for anything but very small
lambda, the regularization zeros out all of the coefficients on every iteration.  Not good.
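For what it's worth, a toy sketch of how a per-step L1 penalty can behave this way (illustrative Python, not Mahout's actual update rule): if the shrinkage from the prior is applied in full on every step it can overshoot a small coefficient past zero, whereas a truncated-gradient style of update (in the spirit of Langford, Li, and Zhang) shrinks toward zero but clamps there.

{noformat}
import math

def naive_l1_step(w, grad, eta, lam):
    # Plain SGD step that applies the L1 prior's subgradient in full;
    # the shrinkage eta*lam can overshoot a small weight past zero.
    return w - eta * grad - eta * lam * math.copysign(1.0, w)

def clipped_l1_step(w, grad, eta, lam):
    # Truncated-gradient variant: the penalty may shrink a weight
    # toward zero but never push it through zero.
    w = w - eta * grad
    if w > 0:
        return max(0.0, w - eta * lam)
    return min(0.0, w + eta * lam)

# A small weight with zero likelihood gradient: the naive step flips
# its sign, while the clipped step stops exactly at zero.
print(naive_l1_step(0.001, 0.0, 0.1, 1.0))    # -0.099
print(clipped_l1_step(0.001, 0.0, 0.1, 1.0))  # 0.0
{noformat}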

I will attach some sample data that I have been using for these experiments.  The reference
for these experiments was an optimization I did in R, where I explicitly optimized a simple
example and got very plausible results.

For the R example, I used the following definition of the function to optimize:

{noformat}
f <- function(beta) {
  p <- w(rowSums(x %*% matrix(beta, ncol = 1)))
  r1 <- -sum(y * log(p + (p == 0)) + (1 - y) * log(1 - p + (p == 1)))
  r2 <- lambda * sum(abs(beta))
  r1 + r2
}

w <- function(x) {
  1 / (1 + exp(-x))
}
{noformat}
Here beta is the coefficient vector, lambda sets the amount of regularization, x holds the input
vectors with one observation per row, y gives the known categories for the rows of x, f is the
combined negative log likelihood (r1) and negative log prior (r2), and w is the logistic function.
I used the unsimplified form of the overall logistic likelihood.  Normally a simpler form,
-sum(y - p), is used, but I wanted to keep things straightforward.
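For readers without R at hand, the same objective can be sketched in plain Python (the names mirror the R code above; lambda, x, and y are the quantities just described):

{noformat}
import math

def w(z):
    # The logistic function, as in the R code above.
    return 1.0 / (1.0 + math.exp(-z))

def f(beta, x, y, lam):
    # r1: negative log likelihood; the clamp on p plays the role of the
    # (p == 0) / (p == 1) guards against log(0) in the R version.
    r1 = 0.0
    for xi, yi in zip(x, y):
        p = w(sum(b * v for b, v in zip(beta, xi)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        r1 -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    # r2: the L1 log-prior term.
    r2 = lam * sum(abs(b) for b in beta)
    return r1 + r2

# At beta = 0 every p is 0.5 and the penalty vanishes, so f is n*log(2).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [0, 0, 1]
print(f([0.0, 0.0], x, y, 0.1))  # 3 * log(2), about 2.079
{noformat}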

The attached file sgd.csv contains the value of x.  The value of y is simply 30 0's followed
by 30 1's.

Optimization was done using this:
{noformat}
lambda <- 0.1
beta.01 <- optim(beta, f, method = "CG", control = list(maxit = 10000))
lambda <- 1
beta.1 <- optim(beta, f, method = "CG", control = list(maxit = 10000))
lambda <- 10
beta.10 <- optim(beta, f, method = "CG", control = list(maxit = 10000))
{noformat}
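As a rough cross-check without R, crude subgradient descent on the same objective reproduces the shape of these runs.  This is a toy pure-Python analogue on synthetic data, not the sgd.csv experiment, and optim's conjugate-gradient method is considerably smarter:

{noformat}
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def objective(beta, x, y, lam):
    # Negative log likelihood plus the L1 penalty, as in f above.
    r1 = 0.0
    for xi, yi in zip(x, y):
        p = logistic(sum(b * v for b, v in zip(beta, xi)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        r1 -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return r1 + lam * sum(abs(b) for b in beta)

def minimize(x, y, lam, steps=2000, eta=0.05):
    # Fixed-step subgradient descent starting from beta = 0.
    beta = [0.0] * len(x[0])
    for _ in range(steps):
        g = [0.0] * len(beta)
        for xi, yi in zip(x, y):
            p = logistic(sum(b * v for b, v in zip(beta, xi)))
            for j, v in enumerate(xi):
                g[j] += (p - yi) * v  # gradient of the likelihood term
        for j in range(len(beta)):
            g[j] += lam * (1.0 if beta[j] > 0 else -1.0 if beta[j] < 0 else 0.0)
            beta[j] -= eta * g[j]
    return beta

# Toy stand-in for sgd.csv: one feature that separates y = 0 from y = 1.
x = [[-1.0], [-1.0], [1.0], [1.0]]
y = [0, 0, 1, 1]
beta = minimize(x, y, lam=0.1)
print(objective(beta, x, y, 0.1) < objective([0.0], x, y, 0.1))  # True
{noformat}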
The values for beta obtained are contained in the file r.csv, and the log-MAP likelihoods are
in logP.csv.

I will shortly add a patch containing my initial test; it will include these test data files.
I will be working on this problem off and on over the next few days, but any hints anybody
has are welcome.  My expectation is that there is a silly oversight in my Java code.

> Need sequential logistic regression implementation using SGD techniques
> -----------------------------------------------------------------------
>
>                 Key: MAHOUT-228
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-228
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>            Reporter: Ted Dunning
>             Fix For: 0.3
>
>         Attachments: logP.csv, MAHOUT-228-1.patch, MAHOUT-228-2.patch, r.csv, sgd.csv
>
>
> Stochastic gradient descent (SGD) is often fast enough for highly scalable learning (see Vowpal Wabbit, http://hunch.net/~vw/).
> I often need to have a logistic regression in Java as well, so that is a reasonable place to start.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

```