spark-issues mailing list archives

From "Nick Pentreath (JIRA)" <>
Subject [jira] [Commented] (SPARK-21405) Add LBFGS solver for GeneralizedLinearRegression
Date Mon, 17 Jul 2017 06:45:02 GMT


Nick Pentreath commented on SPARK-21405:

Ok, sounds good to me. 

Do you think we would be able to re-use the {{logLikelihood}} in {{LogisticRegression}} and
{{LinearRegression}}? This is sort of like the generalized {{LossFunction}} interface
I was alluding to in your optimizer abstraction work.

> Add LBFGS solver for GeneralizedLinearRegression
> ------------------------------------------------
>                 Key: SPARK-21405
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.3.0
>            Reporter: Seth Hendrickson
> GeneralizedLinearRegression in Spark ML currently only allows 4096 features because it
> uses IRLS, and hence WLS, as an optimizer which relies on collecting the covariance matrix
> to the driver. GLMs can also be fit by simple gradient-based methods like LBFGS.
> The new API from [SPARK-19762|] makes this easy to add. I've already prototyped it, and
> it works pretty well. This change would allow an arbitrary number of features (up to what
> can fit on a single node) as in Linear/Logistic regression.
> For reference, other GLM packages also support this - e.g. statsmodels, H2O.
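
To illustrate the point in the description that a GLM can be fit with a gradient-based method such as LBFGS instead of IRLS, here is a minimal sketch in plain Python (standard library only). It is not Spark's implementation and does not use the API from SPARK-19762. The Poisson family with log link, the function names ({{poisson_nll}}, {{lbfgs}}), the backtracking line search, and the toy data are all assumptions chosen for illustration. The optimizer only needs a function returning the negative log-likelihood and its gradient, and it never forms a features-by-features matrix.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def poisson_nll(w, X, y):
    """Mean Poisson negative log-likelihood (log link) and its gradient.
    The log(y!) term is dropped because it does not depend on w."""
    n = len(X)
    loss, grad = 0.0, [0.0] * len(w)
    for xi, yi in zip(X, y):
        eta = dot(w, xi)        # linear predictor
        mu = math.exp(eta)      # mean under the log link
        loss += mu - yi * eta
        for j, xj in enumerate(xi):
            grad[j] += (mu - yi) * xj
    return loss / n, [g / n for g in grad]

def lbfgs(fun, w0, history=5, max_iter=100, tol=1e-8):
    """Hypothetical minimal L-BFGS with Armijo backtracking (illustration only)."""
    w = list(w0)
    loss, grad = fun(w)
    pairs = []  # recent (s, y, rho) curvature pairs, oldest first
    for _ in range(max_iter):
        if math.sqrt(dot(grad, grad)) < tol:
            break
        # Two-loop recursion: apply the implicit inverse-Hessian estimate to grad.
        q, alphas = list(grad), []
        for s, yv, rho in reversed(pairs):
            a = rho * dot(s, q)
            alphas.append(a)
            q = [qi - a * yi for qi, yi in zip(q, yv)]
        if pairs:
            s, yv, _ = pairs[-1]
            gamma = dot(s, yv) / dot(yv, yv)   # initial Hessian scaling
            q = [gamma * qi for qi in q]
        for (s, yv, rho), a in zip(pairs, reversed(alphas)):
            b = rho * dot(yv, q)
            q = [qi + (a - b) * si for qi, si in zip(q, s)]
        direction = [-qi for qi in q]
        slope = dot(grad, direction)
        # Backtracking line search on the Armijo sufficient-decrease condition.
        step = 1.0
        while step > 1e-12:
            w_new = [wi + step * di for wi, di in zip(w, direction)]
            try:
                loss_new, grad_new = fun(w_new)
            except OverflowError:          # exp() overflowed: step was too long
                loss_new = float("inf")
            if loss_new <= loss + 1e-4 * step * slope:
                break
            step *= 0.5
        else:
            break  # no acceptable step found; stop with the current w
        s = [a - b for a, b in zip(w_new, w)]
        yv = [a - b for a, b in zip(grad_new, grad)]
        sy = dot(s, yv)
        if sy > 1e-12:                     # keep only pairs with positive curvature
            pairs = (pairs + [(s, yv, 1.0 / sy)])[-history:]
        w, loss, grad = w_new, loss_new, grad_new
    return w

# Toy data (made up): an intercept column plus one feature, with count responses.
X = [[1.0, i / 10.0] for i in range(20)]
y = [1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6, 6]

w_hat = lbfgs(lambda w: poisson_nll(w, X, y), [0.0, 0.0])
print(w_hat)
```

Per iteration the sketch stores only a handful of vectors of length equal to the number of features, which is why this style of solver avoids the feature limit that comes from collecting a covariance matrix. In a distributed setting, the loss and gradient sums would be aggregated across partitions rather than computed in a local loop.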
