spark-issues mailing list archives

From "DB Tsai (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-4907) Inconsistent loss and gradient in LeastSquaresGradient compared with R
Date Fri, 26 Dec 2014 07:44:13 GMT

    [ https://issues.apache.org/jira/browse/SPARK-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14258981#comment-14258981 ]

DB Tsai commented on SPARK-4907:
--------------------------------

[~sowen] It seems that the existing documentation already has the 1/2 factor in the formula.

> Inconsistent loss and gradient in LeastSquaresGradient compared with R
> ----------------------------------------------------------------------
>
>                 Key: SPARK-4907
>                 URL: https://issues.apache.org/jira/browse/SPARK-4907
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>            Reporter: DB Tsai
>            Assignee: DB Tsai
>             Fix For: 1.3.0
>
>
> In most academic papers and algorithm implementations, people use
> L = 1/(2n) ||A weights - y||^2 instead of L = 1/n ||A weights - y||^2 for the
> least-squares loss. See Eq. (1) in http://web.stanford.edu/~hastie/Papers/glmnet.pdf
> Since MLlib uses a different convention, this results in different residuals, and
> all the statistical properties will differ from those of the GLMNET package in R.
> The model coefficients will still be the same under this change.
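
To illustrate the two conventions described above, here is a minimal pure-Python sketch. It is not MLlib code: the helper names (predict, loss_and_gradient) and the toy data are hypothetical, chosen only to show that the loss and gradient differ by a factor of 2 between L = 1/n ||A weights - y||^2 and L = 1/(2n) ||A weights - y||^2, while the gradient vanishes at the same weights under either scaling (no regularization is assumed here).

```python
# Sketch only: compares two least-squares loss conventions on toy data.
# Function names and data are hypothetical and are not MLlib APIs.

def predict(row, weights):
    """Dot product of one data row with the weight vector."""
    return sum(a * w for a, w in zip(row, weights))

def loss_and_gradient(A, y, weights, scale):
    """Return (loss, gradient) for L = scale * ||A weights - y||^2."""
    residuals = [predict(row, weights) - yi for row, yi in zip(A, y)]
    loss = scale * sum(r * r for r in residuals)
    grad = [
        2.0 * scale * sum(r * row[j] for r, row in zip(residuals, A))
        for j in range(len(weights))
    ]
    return loss, grad

# Toy design matrix (first column is an intercept term) and targets.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 2.0, 2.0]
n = len(y)

# Evaluate both conventions at an arbitrary point.
weights = [0.5, 0.5]
loss_full, grad_full = loss_and_gradient(A, y, weights, 1.0 / n)
loss_half, grad_half = loss_and_gradient(A, y, weights, 1.0 / (2 * n))

# Evaluate both at the least-squares solution of this toy problem,
# obtained by solving the normal equations A^T A w = A^T y by hand.
w_star = [7.0 / 6.0, 0.5]
_, grad_full_star = loss_and_gradient(A, y, w_star, 1.0 / n)
_, grad_half_star = loss_and_gradient(A, y, w_star, 1.0 / (2 * n))

print("loss 1/n:", loss_full, " loss 1/(2n):", loss_half)
print("grad 1/n:", grad_full, " grad 1/(2n):", grad_half)
print("grad at w_star:", grad_full_star, grad_half_star)
```

In this sketch the 1/n loss and gradient are twice the 1/(2n) values at any weights, which is why reported loss values would not match GLMNET's, while both gradients are zero (up to rounding) at the same w_star, which is consistent with the coefficients being unchanged.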



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

