From: "Ulanov, Alexander"
To: "user@spark.apache.org"
Subject: Multiclass classification evaluation measures
Date: Mon, 23 Jun 2014 10:02:29 +0000

Hi,

I've implemented a class with evaluation measures for multiclass classification, along with unit tests. The measures are precision, recall, and F1-measure, each computed per class and averaged. As far as I know, Spark has only a binary classification evaluator, even though Spark's Bayesian classifier supports multiclass classification. I've submitted a pull request, https://github.com/apache/spark/pull/1155, following the guidelines at https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark

The admins have not yet verified my patch. I have a few questions:

1) Do I need to contact somebody to be verified (my GitHub profile isn't linked to my personal page)?

2) Spark users, would you like other measures or features to be implemented for the multiclass evaluator, for example accuracy or returning the confusion matrix?
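
To make the discussion concrete, here is a minimal sketch of the measures mentioned above. It is plain Python rather than Spark code, and it is not the code from the pull request. The function names are hypothetical, and since the kind of averaging is not specified here, the sketch assumes macro averaging (an unweighted mean over classes).

```python
from collections import Counter


def confusion_counts(pairs):
    """Count occurrences keyed by (actual, predicted).

    `pairs` is an iterable of (predicted, actual) label pairs.
    """
    return Counter((actual, predicted) for predicted, actual in pairs)


def per_class_metrics(pairs):
    """Return {label: (precision, recall, f1)} for every label seen."""
    counts = confusion_counts(pairs)
    labels = sorted({label for key in counts for label in key})
    metrics = {}
    for label in labels:
        true_pos = counts[(label, label)]
        # Column sum: everything predicted as `label`.
        predicted_as = sum(n for (_, p), n in counts.items() if p == label)
        # Row sum: everything whose actual class is `label`.
        actually = sum(n for (a, _), n in counts.items() if a == label)
        precision = true_pos / predicted_as if predicted_as else 0.0
        recall = true_pos / actually if actually else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


def macro_average(metrics):
    """Unweighted mean of (precision, recall, f1) over classes (an assumption)."""
    n = len(metrics)
    if n == 0:
        return (0.0, 0.0, 0.0)
    return tuple(sum(m[i] for m in metrics.values()) / n for i in range(3))


def accuracy(pairs):
    """Fraction of pairs where the prediction equals the actual label."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(1 for predicted, actual in pairs if predicted == actual) / len(pairs)


if __name__ == "__main__":
    # Made-up (predicted, actual) pairs for three classes, for illustration only.
    sample = [(0, 0), (0, 0), (1, 0), (1, 1), (1, 1), (2, 1), (2, 2), (0, 2)]
    per_class = per_class_metrics(sample)
    for label, (p, r, f) in per_class.items():
        print(f"class {label}: precision={p:.3f} recall={r:.3f} f1={f:.3f}")
    print("macro average:", macro_average(per_class))
    print("accuracy:", accuracy(sample))
```

All of the measures here derive from the same (actual, predicted) counts, so a confusion matrix could be returned from the same computation at little extra cost.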

 

Best regards, Alexander