spark-issues mailing list archives

From "Nick Pentreath (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-14409) Investigate adding a RankingEvaluator to ML
Date Tue, 17 Jan 2017 15:09:26 GMT

    [ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826186#comment-15826186 ]

Nick Pentreath commented on SPARK-14409:
----------------------------------------

Yes, to be clearer: I would expect the {{k}} param to be specified as in Danilo's version,
for example. I do like the use of windowing to achieve the sort within each user.
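
As a rough illustration of sorting within each user and keeping the top {{k}}, here is a
minimal plain-Python sketch. It is not Danilo's code and not a Spark API: the function name
top_k_per_user and the (user, item, prediction) row layout are assumptions made for the
example. In Spark the same idea would be a window partitioned by user and ordered by
prediction descending.

```python
from collections import defaultdict


def top_k_per_user(rows, k):
    """Return {user: [item, ...]} holding each user's k best-scored items.

    rows is an iterable of (user, item, prediction) tuples (hypothetical layout).
    """
    by_user = defaultdict(list)
    for user, item, prediction in rows:
        by_user[user].append((prediction, item))

    ranked = {}
    for user, scored in by_user.items():
        # Sort by prediction descending; break ties on item so the order is deterministic.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        ranked[user] = [item for _, item in scored[:k]]
    return ranked


rows = [("u1", "a", 0.9), ("u1", "b", 0.2), ("u1", "c", 0.5), ("u2", "a", 0.1)]
top2 = top_k_per_user(rows, k=2)
```

A user with fewer than k scored items simply gets a shorter list.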

This approach would also not work well with purely implicit (unweighted) data. If everything
is relevant in the ground truth, then the model would score perfectly each time. It sort of
works for the explicit rating case, or for the implicit case with "preference weights", since
the ground truth then has an inherent ordering.
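
To make the degenerate case concrete, here is a small plain-Python sketch. It assumes the
evaluation only ranks items that already appear in the ground truth; precision_at_k and
ndcg_at_k are illustrative helpers written for this example, not the {{RankingMetrics}}
implementation. With unweighted data every ordering gets the same perfect score, while with
graded labels the ordering starts to matter.

```python
import math


def precision_at_k(ranked_items, relevant, k):
    """Fraction of the first k ranked items that are in the relevant set."""
    return sum(1 for item in ranked_items[:k] if item in relevant) / k


def ndcg_at_k(ranked_items, gains, k):
    """NDCG with graded gains, a dict mapping item -> relevance weight."""
    def dcg(items):
        return sum(gains.get(item, 0.0) / math.log2(pos + 2)
                   for pos, item in enumerate(items[:k]))

    ideal = sorted(gains, key=gains.get, reverse=True)
    ideal_dcg = dcg(ideal)
    return dcg(ranked_items) / ideal_dcg if ideal_dcg > 0 else 0.0


good_order = ["a", "b", "c"]
bad_order = ["c", "b", "a"]

# Unweighted implicit data: every item in the ground truth is simply "relevant".
relevant = {"a", "b", "c"}
p_good = precision_at_k(good_order, relevant, k=3)
p_bad = precision_at_k(bad_order, relevant, k=3)

# Explicit ratings or preference weights: the ground truth has an ordering.
gains = {"a": 5.0, "b": 3.0, "c": 1.0}
n_good = ndcg_at_k(good_order, gains, k=3)
n_bad = ndcg_at_k(bad_order, gains, k=3)
```

Here p_good and p_bad are equal, so the metric cannot tell the two rankings apart, whereas
n_bad is lower than n_good.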

Still, I think the evaluator must be able to deal with the case of generating recommendations
from the full item set. This means that the "label" and "prediction" columns could contain
nulls. For example, where an item exists in the ground truth but is not recommended (hence no
score), the "prediction" column would be null, while if an item is recommended but is not in
the ground truth, the "label" column would be null. See my comments in SPARK-13857 for details.
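
A minimal plain-Python sketch of how those nulls arise, assuming the ground truth and the
recommendations are combined with a full outer join on (user, item). The function name
outer_join and the dict-based inputs are hypothetical and only stand in for the two
DataFrames; None plays the role of null.

```python
def outer_join(ground_truth, recommendations):
    """Full outer join on (user, item).

    Both arguments map (user, item) -> value. Returns a sorted list of
    (user, item, label, prediction) rows, with None where one side is missing.
    """
    keys = sorted(set(ground_truth) | set(recommendations))
    return [
        (user, item, ground_truth.get((user, item)), recommendations.get((user, item)))
        for user, item in keys
    ]


ground_truth = {("u1", "a"): 4.0, ("u1", "b"): 2.0}
recommendations = {("u1", "a"): 0.9, ("u1", "c"): 0.7}
joined = outer_join(ground_truth, recommendations)
```

In the result, item "b" has a label but no prediction, and item "c" has a prediction but no
label, so an evaluator consuming such rows has to handle both kinds of missing value.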

> Investigate adding a RankingEvaluator to ML
> -------------------------------------------
>
>                 Key: SPARK-14409
>                 URL: https://issues.apache.org/jira/browse/SPARK-14409
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>            Reporter: Nick Pentreath
>            Priority: Minor
>
> {{mllib.evaluation}} contains a {{RankingMetrics}} class, while there is no {{RankingEvaluator}}
> in {{ml.evaluation}}. Such an evaluator can be useful for recommendation evaluation (and can
> be useful in other settings potentially).
> Should be thought about in conjunction with adding the "recommendAll" methods in SPARK-13857,
> so that top-k ranking metrics can be used in cross-validators.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

