hivemall-issues mailing list archives

From "Makoto Yui (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVEMALL-78) AUC UDAF for BinaryClassificationMetrics
Date Tue, 21 Feb 2017 10:51:44 GMT

     [ https://issues.apache.org/jira/browse/HIVEMALL-78?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Makoto Yui updated HIVEMALL-78:
-------------------------------
    Description: 
Add support for computing the area under the ROC curve (AUROC), a binary classification metric.

{code:sql}
-- option 1
select
   auroc(label, prob) as auroc
from
   data;

-- option 2
WITH roc as (
  select
    roc(label, prob) as (fpr, tpr) -- assumes roc emits (fpr, tpr) rows
  from
    data
)
select
  auc(fpr, tpr) as auroc -- auc is a UDAF; its input is sorted by fpr asc
from (  
  select 
    fpr, tpr
  from
    roc
  DISTRIBUTE BY 
    floor(fpr / 0.2) -- 5 bins
  SORT BY 
    fpr ASC
) t
{code}
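
As a rough illustration of what the two options would compute, here is a minimal Python sketch. It is not Hivemall's implementation: the names roc_points, auc, auroc, PartialAuc and merge_partials are hypothetical, and the trapezoidal rule is an assumption about how auc(fpr, tpr) would integrate the curve. The merge step shows why a UDAF fed by DISTRIBUTE BY bins would need each bin's first and last point in addition to its partial area.

```python
from typing import List, NamedTuple, Sequence, Tuple

Point = Tuple[float, float]  # (fpr, tpr)


def roc_points(labels: Sequence[int], probs: Sequence[float]) -> List[Point]:
    """Return (fpr, tpr) points in curve order, one per distinct score."""
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    # Sweep the decision threshold from the highest score down.
    ranked = sorted(zip(probs, labels), key=lambda pair: -pair[0])
    points: List[Point] = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, y) in enumerate(ranked):
        if y == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last row in a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def auc(points: Sequence[Point]) -> float:
    """Trapezoidal area under points that are already in curve order."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def auroc(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Option 1: AUROC directly from labels and predicted probabilities."""
    return auc(roc_points(labels, probs))


class PartialAuc(NamedTuple):
    """Hypothetical per-bin aggregation state for option 2."""
    first: Point
    last: Point
    area: float


def partial_auc(points: Sequence[Point]) -> PartialAuc:
    return PartialAuc(points[0], points[-1], auc(points))


def merge_partials(partials: Sequence[PartialAuc]) -> float:
    """Sum per-bin areas plus the trapezoid that bridges adjacent bins."""
    ordered = sorted(partials, key=lambda p: p.first[0])
    total = 0.0
    for i, part in enumerate(ordered):
        total += part.area
        if i > 0:
            x0, y0 = ordered[i - 1].last
            x1, y1 = part.first
            total += (x1 - x0) * (y0 + y1) / 2.0
    return total
```

Note that the sketch relies on points with equal fpr staying in curve order (tpr ascending). Sorting by fpr alone does not guarantee that, so an actual UDAF would probably need a tie-break on tpr.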

References:
http://www.citeulike.org/user/myui/article/12615084
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/evaluation/BinaryClassificationMetrics.scala
https://spark.apache.org/docs/latest/mllib-evaluation-metrics.html

  was:
http://www.citeulike.org/user/myui/article/12615084
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/evaluation/BinaryClassificationMetrics.scala

{code:sql}
-- option 1
select
   auroc(label, prob) as auroc
from
   data;

-- option 2
WITH roc as (
  select
    roc(label, prob) as (fpr, tpr) -- assumes roc emits (fpr, tpr) rows
  from
    data
)
select
  auc(fpr, tpr) as auroc -- auc is a UDAF; its input is sorted by fpr asc
from (  
  select 
    fpr, tpr
  from
    roc
  DISTRIBUTE BY 
    floor(fpr / 0.2) -- 5 bins
  SORT BY 
    fpr ASC
) t
{code}


> AUC UDAF for BinaryClassificationMetrics
> ----------------------------------------
>
>                 Key: HIVEMALL-78
>                 URL: https://issues.apache.org/jira/browse/HIVEMALL-78
>             Project: Hivemall
>          Issue Type: New Feature
>            Reporter: Makoto Yui
>            Assignee: Takuya Kitazawa
>            Priority: Minor
>
> Add support for computing the area under the ROC curve (AUROC), a binary classification metric.
> {code:sql}
> -- option 1
> select
>    auroc(label, prob) as auroc
> from
>    data;
> -- option 2
> WITH roc as (
>   select
>     roc(label, prob) as (fpr, tpr) -- assumes roc emits (fpr, tpr) rows
>   from
>     data
> )
> select
>   auc(fpr, tpr) as auroc -- auc is a UDAF; its input is sorted by fpr asc
> from (  
>   select 
>     fpr, tpr
>   from
>     roc
>   DISTRIBUTE BY 
>     floor(fpr / 0.2) -- 5 bins
>   SORT BY 
>     fpr ASC
> ) t
> {code}
> References:
> http://www.citeulike.org/user/myui/article/12615084
> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/evaluation/BinaryClassificationMetrics.scala
> https://spark.apache.org/docs/latest/mllib-evaluation-metrics.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
