spark-issues mailing list archives

From "Yanbo Liang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-17163) Decide on unified multinomial and binary logistic regression interfaces
Date Wed, 24 Aug 2016 07:48:21 GMT

    [ https://issues.apache.org/jira/browse/SPARK-17163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15434412#comment-15434412 ]

Yanbo Liang commented on SPARK-17163:
-------------------------------------

I think it is hard to unify binary and multinomial logistic regression without making any
breaking changes:
* As [~sethah] said, we need to find a way to unify the representation of {{coefficients}}
and {{intercept}}. I think flattening the matrix into a vector is still a compromise; the best
representation would be a matrix for {{coefficients}} and a vector for {{intercept}}, even for
a binary classification problem. This would be consistent with other ML models such as {{NaiveBayesModel}},
which also supports multi-class classification.
* MLOR and LOR return different results for binary classification when regularization is used.
* The current LOR code base provides both {{setThreshold}} and {{setThresholds}} for binary logistic
regression, and the two settings interact with each other. If we make MLOR and LOR share the old LOR code
base, it will also introduce breaking changes to these APIs.
* Model store/load compatibility is another concern.
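
To make the first and third points concrete, here is a minimal sketch in plain Python (not Spark code). It assumes a two-class model stored as a coefficient matrix with one row per class plus an intercept vector, and shows that the usual binary form (a single vector and a scalar intercept) gives the same class-1 probability when it is taken as the difference of the two rows. The numbers, the helper names, and the row-major flattening are illustrative assumptions. The threshold conversion follows the relationship between {{threshold}} and {{thresholds}} as I recall it from the Spark ML documentation, so please check it against the current docs.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(margins):
    # Subtract the max margin for numerical stability.
    m = max(margins)
    exps = [math.exp(v - m) for v in margins]
    total = sum(exps)
    return [e / total for e in exps]


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


# Matrix representation: one row of coefficients and one intercept per class.
# Values are made up for illustration.
coefficient_matrix = [[0.2, -0.4, 0.1],
                      [0.5, 0.3, -0.7]]
intercept_vector = [0.1, -0.3]


def predict_multinomial(x):
    margins = [dot(row, x) + b
               for row, b in zip(coefficient_matrix, intercept_vector)]
    return softmax(margins)


# Binary representation: a single vector and a scalar intercept,
# obtained here as (class-1 row) minus (class-0 row).
binary_coefficients = [w1 - w0 for w0, w1
                       in zip(coefficient_matrix[0], coefficient_matrix[1])]
binary_intercept = intercept_vector[1] - intercept_vector[0]


def predict_binary(x):
    return sigmoid(dot(binary_coefficients, x) + binary_intercept)


# Flattened representation (row-major order is an assumption of this sketch):
# the caller must know the number of classes to recover the rows.
flattened_coefficients = [v for row in coefficient_matrix for v in row]


# Assumed relationship between the single binary threshold and the
# per-class thresholds: t <-> [1 - t, t].
def threshold_to_thresholds(t):
    return [1.0 - t, t]


def thresholds_to_threshold(ts):
    return 1.0 / (1.0 + ts[0] / ts[1])


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    print(predict_multinomial(x)[1], predict_binary(x))
    print(thresholds_to_threshold(threshold_to_thresholds(0.3)))
```

The two predictions agree only because no regularization is involved in this sketch; it says nothing about how the fitted values differ once a penalty is applied to each parameterization.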

I would prefer to keep LOR and MLOR as separate APIs, but I do not hold this opinion strongly
if you have a better proposal. Thanks!

> Decide on unified multinomial and binary logistic regression interfaces
> -----------------------------------------------------------------------
>
>                 Key: SPARK-17163
>                 URL: https://issues.apache.org/jira/browse/SPARK-17163
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML, MLlib
>            Reporter: Seth Hendrickson
>
> Before the 2.1 release, we should finalize the API for logistic regression. After SPARK-7159,
we have both LogisticRegression and MultinomialLogisticRegression models. This may be confusing
to users and is a bit superfluous, since MLOR can do basically all of what BLOR does. We should
decide if it needs to be changed and implement those changes before 2.1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

