spark-dev mailing list archives

From 白刚 <baig...@staff.sina.com.cn>
Subject Re: Contributing to MLlib on GLM
Date Sat, 28 Jun 2014 01:44:47 GMT
Hi Xiaokai,

My bad. I didn't notice this before I created another PR for Poisson regression; the emails
had been routed to my junk folder by our corporate mail filter. Also, thanks for considering
my comments and advice in your PR.

Adding my two cents here:

* PoissonRegressionModel and GammaRegressionModel have the same fields and prediction method.
Shall we use one class instead of two redundant ones? Say, a LogLinearModel.
* The LBFGS optimizer takes fewer iterations and converges better than SGD. I
implemented two GeneralizedLinearAlgorithm classes, using LBFGS and SGD respectively. You may
take a look at them. If that's OK with you, I'd be happy to send a PR to your branch.
* In addition to the generated test data, we may use some real-world data for testing. In
my implementation, I added the test data from https://onlinecourses.science.psu.edu/stat504/node/223.
Please check my test suite.
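
To make the first point concrete, here is a rough sketch in plain Python (not MLlib Scala, and not the code in either PR). It assumes both models use a log link, so a single prediction method covers them and only the per-example gradient differs. The function names, the omitted intercept in the gradients, and the unit shape parameter for the Gamma case are assumptions made for illustration.

```python
import math


class LogLinearModel:
    """Hypothetical shared model: predicted mean = exp(w . x + intercept)."""

    def __init__(self, weights, intercept=0.0):
        self.weights = list(weights)
        self.intercept = intercept

    def predict(self, features):
        margin = sum(w * x for w, x in zip(self.weights, features)) + self.intercept
        return math.exp(margin)


def _margin(weights, features):
    # Linear predictor w . x (intercept left out to keep the sketch short).
    return sum(w * x for w, x in zip(weights, features))


def poisson_gradient(weights, features, label):
    # Per-example negative log-likelihood (up to a constant): exp(m) - y * m
    # Gradient with respect to w: (exp(m) - y) * x
    m = _margin(weights, features)
    scale = math.exp(m) - label
    return [scale * x for x in features]


def gamma_gradient(weights, features, label):
    # Gamma with log link and unit shape (assumed):
    # negative log-likelihood (up to a constant): m + y * exp(-m)
    # Gradient with respect to w: (1 - y * exp(-m)) * x
    m = _margin(weights, features)
    scale = 1.0 - label * math.exp(-m)
    return [scale * x for x in features]
```

Either gradient could be handed to an SGD or LBFGS style optimizer, and the fitted weights would go into the same LogLinearModel for prediction.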

-Gang
Sent from my iPad

> On Jun 27, 2014, at 6:03 PM, "xwei" <weixiaokai@gmail.com> wrote:
> 
> 
> Yes, that's what we did: we added two gradient functions to Gradient.scala and
> created PoissonRegression and GammaRegression using these gradients. We made
> a PR on this.
> 
> 
> 
> --
> View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Contributing-to-MLlib-on-GLM-tp7033p7088.html
> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.