spark-user mailing list archives

From Xiangrui Meng <men...@gmail.com>
Subject Re: confidence/probability for prediction in MLlib
Date Wed, 07 Jan 2015 05:17:02 GMT
This is addressed in https://issues.apache.org/jira/browse/SPARK-4789.
In the new pipeline API, we can simply output two columns, one for the
best predicted class, and the other for probabilities or confidence
scores for each class. -Xiangrui

On Tue, Jan 6, 2015 at 11:43 AM, Jianguo Li <flyingfromchina@gmail.com> wrote:
> Hi,
>
> A while ago, somebody asked about getting a confidence value for a prediction
> with MLlib's implementation of Naive Bayes classification.
>
> I was wondering if there is any plan in the near future for the predict
> function to return both a label and a confidence/probability? Or could the
> private variables in the various machine learning models be exposed so we
> could write our own functions which return both?
>
> Having a confidence/probability could be very useful in real applications.
> For one thing, you can choose to trust the predicted label only if it has a
> high confidence level. Also, if you want to combine the results from
> multiple classifiers, the confidence/probability could be used as a weight
> when combining them.
>
> Thanks,
>
> Jianguo
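
The two uses described in the quoted message can be sketched in plain Python.
This is an illustration under stated assumptions, not MLlib code: the helper
names `trusted_label` and `combine`, the 0.8 default threshold, and the choice
of summing probabilities per label as the weighting scheme are all
hypothetical.

```python
from typing import Dict, List, Optional


def trusted_label(label: str, confidence: float, threshold: float = 0.8) -> Optional[str]:
    """Return the label only if its confidence reaches the threshold, else None."""
    return label if confidence >= threshold else None


def combine(predictions: List[Dict[str, float]]) -> str:
    """Combine several classifiers by summing their per-class probabilities.

    Each element of `predictions` maps label -> probability for one classifier.
    A confident classifier therefore contributes more weight to its label than
    an uncertain one, unlike a plain majority vote.
    """
    if not predictions:
        raise ValueError("need at least one classifier output")
    totals: Dict[str, float] = {}
    for probs in predictions:
        for label, p in probs.items():
            totals[label] = totals.get(label, 0.0) + p
    return max(totals, key=totals.get)


outputs = [
    {"a": 0.9, "b": 0.1},    # very confident in "a"
    {"a": 0.4, "b": 0.6},    # weakly prefers "b"
    {"a": 0.45, "b": 0.55},  # weakly prefers "b"
]
print(trusted_label("a", 0.9))  # accepted: confidence is above the threshold
print(trusted_label("b", 0.6))  # rejected: confidence is below the threshold
print(combine(outputs))         # "a" wins on total weight despite losing 2 votes to 1
```

In this example a majority vote would pick "b", while the probability-weighted
sum picks "a", which is the kind of difference confidence values make possible.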
