spark-issues mailing list archives

From "Ehsan Mohyedin Kermani (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-8534) Gini for regression metrics and evaluator
Date Tue, 01 Sep 2015 22:52:47 GMT

    [ https://issues.apache.org/jira/browse/SPARK-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14726364#comment-14726364 ]

Ehsan Mohyedin Kermani commented on SPARK-8534:
-----------------------------------------------

I'd like to give it a shot, but first I think we need a distributed scan function for computing
the cumulative sum of the sorted predictions. Would it be possible to add that to RegressionMetrics
or perhaps mllib.util first? An implementation was suggested here: https://groups.google.com/forum/#!topic/spark-users/ts-FdB50ltY
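
To make the request concrete, here is a minimal sketch of the two-pass scan idea in plain Python, with a list of lists standing in for the partitions of a sorted RDD. The name `distributed_cumsum` is hypothetical, and this is not necessarily the implementation suggested in the linked thread. In Spark, each pass could presumably be expressed with something like `RDD.mapPartitionsWithIndex`, with only the per-partition totals collected to the driver.

```python
from itertools import accumulate


def distributed_cumsum(partitions):
    """Cumulative sum over values that are split across partitions.

    `partitions` is a list of lists standing in for RDD partitions;
    the values are assumed to be globally sorted already.
    """
    # Pass 1: each partition reports only its local total.
    totals = [sum(p) for p in partitions]
    # Driver side: an exclusive prefix sum of the totals gives the
    # running sum that precedes each partition.
    offsets = [0] + list(accumulate(totals))[:-1]
    # Pass 2: each partition scans locally and shifts by its offset.
    return [
        [offset + running for running in accumulate(p)]
        for p, offset in zip(partitions, offsets)
    ]


if __name__ == "__main__":
    print(distributed_cumsum([[1, 2], [3], [], [4, 5]]))
```

Only one number per partition travels to the driver, so the extra cost beyond the sort is two passes over the data.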


> Gini for regression metrics and evaluator
> -----------------------------------------
>
>                 Key: SPARK-8534
>                 URL: https://issues.apache.org/jira/browse/SPARK-8534
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML, MLlib
>            Reporter: Joseph K. Bradley
>            Priority: Minor
>
> One common metric we do not have in RegressionMetrics or RegressionEvaluator is Gini: [https://www.kaggle.com/wiki/Gini]
> Implementing (normalized) Gini would be nice.  However, it might be expensive; I believe it would require sorting the labels.
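
For reference, a single-machine sketch of normalized Gini, following the formulation commonly used in Kaggle competitions as I understand it. The function names are hypothetical, and this is not a proposed RegressionMetrics API. It assumes the labels have a nonzero sum and are not all equal, and it shows where the sort and the cumulative sum come in.

```python
def gini(actual, pred):
    """Unnormalized Gini of `actual` when ordered by `pred`, largest first."""
    n = len(actual)
    # Sort indices by prediction, descending; ties keep their original order.
    order = sorted(range(n), key=lambda i: (-pred[i], i))
    total = float(sum(actual))  # assumed nonzero
    running = 0.0
    gini_sum = 0.0
    for i in order:
        # Cumulative share of the labels captured so far.
        running += actual[i]
        gini_sum += running / total
    # Subtract the value expected from a random ordering.
    gini_sum -= (n + 1) / 2.0
    return gini_sum / n


def normalized_gini(actual, pred):
    """Gini of the predictions divided by the Gini of a perfect ordering."""
    return gini(actual, pred) / gini(actual, actual)


if __name__ == "__main__":
    print(normalized_gini([1, 2, 3], [0.1, 0.4, 0.9]))
```

In this formulation the sort is keyed on the predictions, and the labels themselves are sorted only for the normalizing term.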



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


