flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2297) Add threshold setting for SVM binary predictions
Date Wed, 01 Jul 2015 09:48:04 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609840#comment-14609840 ]

ASF GitHub Bot commented on FLINK-2297:
---------------------------------------

Github user thvasilo commented on a diff in the pull request:

    https://github.com/apache/flink/pull/874#discussion_r33663520
  
    --- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/classification/SVM.scala ---
    @@ -242,8 +275,21 @@ object SVM{
             }
           }
     
    -      override def predict(value: T, model: DenseVector): Double = {
    -        value.asBreeze dot model.asBreeze
    +      override def predict(value: T, model: DenseVector, predictParameters: ParameterMap):
    +        Double = {
    +        val thresholdOption = predictParameters.get(Threshold)
    +
    +        val rawValue = value.asBreeze dot model.asBreeze
    +        // If the Threshold option has been reset, we will get back a Some(None) thresholdOption
    +        // causing the exception when we try to get the value. In that case we just return the
    +        // raw value
    +        try {
    +          val thresOptionValue = thresholdOption.get
    +          if (rawValue > thresOptionValue) 1.0 else -1.0
    +        }
    +        catch {
    +          case e: java.lang.ClassCastException => rawValue
    +        }
    --- End diff --
    
    This relates to the previous discussion:
    
    I do believe we want this turned on by default: when you train a binary classifier, you expect that `predict` will return binary labels, not the decision function values.
    
    So if we have `None` as default, the user could write:
    
    ```scala
    val svm = SVM().
          setBlocks(env.getParallelism)
    
    svm.fit(train)
    val eval = svm.evaluate(test)
    ```
    
    and the `eval` output would not make sense, because the predictions would be raw decision function values rather than binary labels. But if they wrote
    
    ```scala
    val svm = SVM().
          setBlocks(env.getParallelism).
          setThreshold(0.0)
    
    svm.fit(train)
    val eval = svm.evaluate(test)
    ```
    
    it would.
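    
    As a rough illustration of the two behaviours (a stand-alone sketch, not the code in this pull request; `ThresholdSketch` and `applyThreshold` are hypothetical names, and no Flink classes are used), an optional threshold could be applied to the raw decision value like this:
    
    ```scala
    // Hypothetical stand-alone sketch; this is not the Flink ML API.
    object ThresholdSketch {

      /** Maps a raw decision value to a binary label when a threshold is set;
        * returns the raw value unchanged when no threshold is given.
        */
      def applyThreshold(rawValue: Double, threshold: Option[Double]): Double =
        threshold match {
          // Values strictly above the threshold are labeled positive.
          case Some(t) => if (rawValue > t) 1.0 else -1.0
          // No threshold: fall back to the raw decision function value.
          case None    => rawValue
        }
    }
    ```
    
    With a threshold of `0.0`, a raw value of `0.37` maps to `1.0` and `-2.5` maps to `-1.0`; with `None`, the raw value is returned unchanged, which is the case where the evaluation output would not be meaningful.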


> Add threshold setting for SVM binary predictions
> ------------------------------------------------
>
>                 Key: FLINK-2297
>                 URL: https://issues.apache.org/jira/browse/FLINK-2297
>             Project: Flink
>          Issue Type: Improvement
>          Components: Machine Learning Library
>            Reporter: Theodore Vasiloudis
>            Assignee: Theodore Vasiloudis
>            Priority: Minor
>              Labels: ML
>             Fix For: 0.10
>
>
> Currently SVM outputs the raw decision function values when using the predict function.
> We should instead have the ability to set a threshold above which examples are labeled as positive (1.0) and below which they are labeled as negative (-1.0). Then the prediction function can be directly used for evaluation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
