Hi,
I am looking for logistic regression implementations that run on Hadoop.
Mahout is one good package for Hadoop, but I have not been able to get
good results with it in my experiments.
WEKA also provides logistic regression, which I have used on Windows.
I assume I can run the same classifier from the JAR file as-is on Linux:
java -classpath weka.jar weka.classifiers.functions.Logistic -R 1.0E-8 -M 6 -t lr.arff
Has anyone ported these algorithms to take advantage of Hadoop?
How should I interpret the output below, in particular the Coefficients and
Odds Ratios, and how can they be used for classification?
Options: -R 1.0E-8 -M 6

Logistic Regression with ridge parameter of 1.0E-8
Coefficients...
                 Class
Variable       class_1
======================
a1                   0
a2                   0
a3                   0
a4              0.0082
a5              0.0151
a6             -0.1034
a7                   0
a8                   0
a9                   0
a10            -0.0397
a11            -0.0003
a13            -0.1195
a14            -0.1389
Intercept       21.487

Odds Ratios...
                 Class
Variable       class_1
======================
a1                   1
a2                   1
a3                   1
a4              1.0083
a5              1.0152
a6              0.9018
a7                   1
a8                   1
a9                   1
a10              0.961
a11             0.9997
a13             0.8873
a14             0.8703
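
As far as I can tell, each odds ratio is simply exp(coefficient), and for a two-class problem the probability of class_1 would be 1 / (1 + exp(-(intercept + sum of coefficient * attribute value))). Below is a minimal Java sketch under that assumption, not WEKA's own code. The class and method names, the toy intercept and the sample instance are hypothetical, and the negative signs on the a6 and a14 coefficients are inferred from their odds ratios being below 1.

```java
/** Sketch: relate the "Coefficients" and "Odds Ratios" tables and score one instance. */
final class LogisticOutputSketch {

    /** Odds ratio for a one-unit increase in an attribute: exp(coefficient). */
    static double oddsRatio(double coefficient) {
        return Math.exp(coefficient);
    }

    /** Assumed two-class model: P(class_1 | x) = 1 / (1 + exp(-(intercept + sum_i coef_i * x_i))). */
    static double probabilityOfClass1(double intercept, double[] coefficients, double[] x) {
        double z = intercept;
        for (int i = 0; i < coefficients.length; i++) {
            z += coefficients[i] * x[i];
        }
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** Predict class_1 when its probability is at least 0.5, otherwise class_2. */
    static String classify(double intercept, double[] coefficients, double[] x) {
        return probabilityOfClass1(intercept, coefficients, x) >= 0.5 ? "class_1" : "class_2";
    }

    public static void main(String[] args) {
        // a6 and a14 have odds ratios below 1 (0.9018 and 0.8703),
        // so their coefficients are assumed to be negative here.
        System.out.printf("a6  odds ratio: %.4f%n", oddsRatio(-0.1034));
        System.out.printf("a14 odds ratio: %.4f%n", oddsRatio(-0.1389));

        // Hypothetical toy model and instance, not the real lr.arff data.
        double intercept = 1.0;
        double[] coefficients = {-0.1034, -0.1389};
        double[] x = {2.0, 10.0};
        System.out.println("predicted: " + classify(intercept, coefficients, x));
    }
}
```

Read this way, an odds ratio of 1 (coefficient 0) means the attribute does not move the prediction, a value above 1 pushes towards class_1 as the attribute grows, and a value below 1 pushes towards class_2.
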
Time taken to build model: 6.39 seconds
Time taken to test model on training data: 1.86 seconds

=== Error on training data ===

Correctly Classified Instances       49528      99.9173 %
Incorrectly Classified Instances        41       0.0827 %
Kappa statistic                          0.9983
Mean absolute error                      0.0011
Root mean squared error                  0.0244
Relative absolute error                  0.2202 %
Root relative squared error              4.895 %
Total Number of Instances            49569

=== Confusion Matrix ===

     a     b   <-- classified as
 26526    37 |     a = class_1
     4 23002 |     b = class_2

=== Stratified cross-validation ===

Correctly Classified Instances       49492      99.8447 %
Incorrectly Classified Instances        77       0.1553 %
Kappa statistic                          0.9969
Mean absolute error                      0.0015
Root mean squared error                  0.0358
Relative absolute error                  0.3108 %
Root relative squared error              7.1718 %
Total Number of Instances            49569

=== Confusion Matrix ===

     a     b   <-- classified as
 26532    31 |     a = class_1
    46 22960 |     b = class_2
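
The accuracy and Kappa figures can be cross-checked from the two confusion matrices. Here is a minimal Java sketch, assuming rows are actual classes and columns are predicted classes, and using the standard Cohen's kappa formula. The class and method names are my own, not part of WEKA.

```java
/** Sketch: recompute accuracy and Cohen's kappa from a confusion matrix. */
final class ConfusionMatrixSketch {

    /** Sum of all cells, i.e. the total number of instances. */
    static double total(long[][] m) {
        double n = 0;
        for (long[] row : m) {
            for (long cell : row) {
                n += cell;
            }
        }
        return n;
    }

    /** Share of instances on the diagonal (actual class equals predicted class). */
    static double accuracy(long[][] m) {
        double correct = 0;
        for (int k = 0; k < m.length; k++) {
            correct += m[k][k];
        }
        return correct / total(m);
    }

    /** Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement). */
    static double kappa(long[][] m) {
        double n = total(m);
        double expected = 0;
        for (int k = 0; k < m.length; k++) {
            double rowSum = 0;
            double colSum = 0;
            for (int j = 0; j < m.length; j++) {
                rowSum += m[k][j];
                colSum += m[j][k];
            }
            // Chance agreement for class k from its row and column marginals.
            expected += (rowSum / n) * (colSum / n);
        }
        double observed = accuracy(m);
        return (observed - expected) / (1.0 - expected);
    }

    public static void main(String[] args) {
        long[][] training = {{26526, 37}, {4, 23002}};
        long[][] crossValidation = {{26532, 31}, {46, 22960}};
        System.out.printf("training:         %.4f %% correct, kappa %.4f%n",
                100 * accuracy(training), kappa(training));
        System.out.printf("cross-validation: %.4f %% correct, kappa %.4f%n",
                100 * accuracy(crossValidation), kappa(crossValidation));
    }
}
```

Up to rounding, this should reproduce the reported 99.9173 % and 0.9983 on the training data and 99.8447 % and 0.9969 under cross-validation.
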
Thanks in advance.
Rajesh
