hadoop-common-user mailing list archives

From Rajesh Nikam <rajeshni...@gmail.com>
Subject WEKA logistic regression on hadoop
Date Tue, 16 Oct 2012 13:46:45 GMT
Hi,

I am looking for logistic regression algorithms that run on Hadoop.
Mahout is one good package to use on Hadoop, however I have not been able to
get good results in my experiments.

WEKA supports logistic regression algorithms, which I have used on Windows.
I guess I should be able to run these algorithms from the JAR files as-is on
Linux.

java -classpath weka.jar weka.classifiers.functions.Logistic -R 1.0E-8 -M 6 -t lr.arff

Has anyone ported them to take advantage of Hadoop?

How should I interpret the output it generates? In particular, what are the
Coefficients and Odds Ratios, and how can they be used for classification?


Options: -R 1.0E-8 -M 6

Logistic Regression with ridge parameter of 1.0E-8
Coefficients...
                 Class
Variable       class_1
======================
a1                   0
a2                   0
a3                   0
a4              0.0082
a5              0.0151
a6             -0.1034
a7                   0
a8                   0
a9                   0
a10            -0.0397
a11            -0.0003
a13            -0.1195
a14            -0.1389
Intercept      -21.487


Odds Ratios...
                 Class
Variable       class_1
======================
a1                   1
a2                   1
a3                   1
a4              1.0083
a5              1.0152
a6              0.9018
a7                   1
a8                   1
a9                   1
a10              0.961
a11             0.9997
a13             0.8873
a14             0.8703
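
For reference, the two tables above are related: each odds ratio is exp(coefficient), and small mismatches in the last digit come from the coefficients being printed rounded. The sketch below, in Python, recomputes the odds ratios and scores an instance with the logistic function. It assumes the coefficients apply to raw attribute values and that the resulting probability is for class_1 (the column shown); neither assumption is confirmed by the output itself. The names `coefficients`, `odds_ratio` and `prob_class_1`, and the all-zero instance, are illustrative only.

```python
import math

# Coefficients copied from the "Coefficients..." table (class_1 column).
coefficients = {
    "a1": 0.0, "a2": 0.0, "a3": 0.0,
    "a4": 0.0082, "a5": 0.0151, "a6": -0.1034,
    "a7": 0.0, "a8": 0.0, "a9": 0.0,
    "a10": -0.0397, "a11": -0.0003,
    "a13": -0.1195, "a14": -0.1389,
}
intercept = -21.487


def odds_ratio(coef):
    """Multiplicative change in the odds of class_1 per unit increase."""
    return math.exp(coef)


def prob_class_1(instance):
    """Estimated P(class_1) for a dict of attribute name -> numeric value.

    Assumption: attributes missing from the dict are treated as 0.
    """
    z = intercept + sum(c * instance.get(name, 0.0)
                        for name, c in coefficients.items())
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


if __name__ == "__main__":
    for name, c in coefficients.items():
        print(f"{name:<4} coef={c:>8.4f}  odds ratio={odds_ratio(c):.4f}")
    # Hypothetical instance with every attribute equal to 0.
    p = prob_class_1({})
    print("P(class_1) =", p, "->", "class_1" if p >= 0.5 else "class_2")
```

Read this way, a coefficient of 0 (odds ratio 1) means the attribute has no effect on the prediction, a positive coefficient (odds ratio above 1) pushes towards class_1, and a negative one (odds ratio below 1) pushes away from it.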

Time taken to build model: 6.39 seconds
Time taken to test model on training data: 1.86 seconds

=== Error on training data ===

Correctly Classified Instances       49528               99.9173 %
Incorrectly Classified Instances        41                0.0827 %
Kappa statistic                          0.9983
Mean absolute error                      0.0011
Root mean squared error                  0.0244
Relative absolute error                  0.2202 %
Root relative squared error              4.895  %
Total Number of Instances            49569


=== Confusion Matrix ===

     a     b   <-- classified as
 26526    37 |     a = class_1
     4 23002 |     b = class_2



=== Stratified cross-validation ===

Correctly Classified Instances       49492               99.8447 %
Incorrectly Classified Instances        77                0.1553 %
Kappa statistic                          0.9969
Mean absolute error                      0.0015
Root mean squared error                  0.0358
Relative absolute error                  0.3108 %
Root relative squared error              7.1718 %
Total Number of Instances            49569


=== Confusion Matrix ===

     a     b   <-- classified as
 26532    31 |     a = class_1
    46 22960 |     b = class_2
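
As a cross-check, the accuracy and kappa figures in the summaries can be recomputed from the two confusion matrices. A minimal Python sketch follows; it assumes the kappa reported is Cohen's kappa, and the function names are illustrative.

```python
def accuracy(matrix):
    """Fraction of instances on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Agreement between actual and predicted labels, corrected for chance."""
    total = sum(sum(row) for row in matrix)
    observed = accuracy(matrix)
    row_totals = [sum(row) for row in matrix]        # actual class counts
    col_totals = [sum(col) for col in zip(*matrix)]  # predicted class counts
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1.0 - expected)


# Rows are actual classes, columns are predicted classes, as printed above.
training = [[26526, 37], [4, 23002]]
cross_validation = [[26532, 31], [46, 22960]]

if __name__ == "__main__":
    for label, m in (("training", training),
                     ("cross-validation", cross_validation)):
        print(f"{label}: accuracy={accuracy(m) * 100:.4f} %  "
              f"kappa={cohen_kappa(m):.4f}")
```

The cross-validation figures are the ones to rely on as an estimate of performance on unseen data; here they are only slightly below the training figures.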

Thanks in advance.
Rajesh
