mahout-dev mailing list archives

From Daniel McEnnis <dmcen...@gmail.com>
Subject Re: Used my own data for the 20NewsGroup example. TestClassifier giving incorrect output
Date Wed, 11 May 2011 17:03:37 GMT
Dipti,

Double-check that the data you are classifying is in "category\ttokenized text"
format, with a tab between the category and the text (i.e. the output of the
testclassifier data builder rather than the classifier data builder).
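
A quick way to sanity-check the format is a small script like the one below. This is only a minimal sketch, not part of Mahout: it assumes one document per line, with the category and the tokenized text separated by a single tab, and that the test file has been copied somewhere readable locally. The function name `find_malformed_lines` is hypothetical.

```python
import sys


def find_malformed_lines(lines):
    """Return (line_number, reason) pairs for lines that are not
    shaped like 'category<TAB>tokenized text'.

    Assumption: one document per line, label and text separated by a tab.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        label, sep, text = line.partition("\t")
        if not sep:
            problems.append((number, "no tab separator"))
        elif not label.strip():
            problems.append((number, "empty category"))
        elif not text.strip():
            problems.append((number, "empty text"))
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_format.py <local copy of a test input file>
    with open(sys.argv[1], encoding="utf-8") as handle:
        for number, reason in find_malformed_lines(handle):
            print(f"line {number}: {reason}")
```

If every line is reported as having no tab separator, the files were probably not prepared in the tab-separated form described above.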

Daniel.

On Wed, May 11, 2011 at 9:42 AM, Dipti Mathur <diptidmathur@gmail.com> wrote:
> Hi All,
>
> I trained a model on my own data, following the 20NewsGroup example. However,
> when I test the classifier (the test data is the same as the training data,
> just for simplicity's sake for now), I get the following output. Any ideas?
>
> dipti@dipti-laptop:~$ mahout/trunk/bin/mahout testclassifier -m
> ruralsearch/bayes-model/ -d ruralsearch/test-input/ -type bayes -ng 1
> -source hdfs -method sequential
> Running on hadoop, using HADOOP_HOME=/usr/lib/hadoop-0.20.2/
> HADOOP_CONF_DIR=/usr/lib/hadoop-0.20.2/conf
> 11/05/11 19:02:35 INFO bayes.TestClassifier: Loading model from:
> {basePath=ruralsearch/bayes-model/, classifierType=bayes, alpha_i=1.0,
> dataSource=hdfs, gramSize=1, verbose=false, encoding=UTF-8,
> defaultCat=unknown, testDirPath=ruralsearch/test-input/}
> 11/05/11 19:02:35 INFO bayes.TestClassifier: Testing Bayes Classifier
> 11/05/11 19:02:36 INFO io.SequenceFileModelReader: 135467.11329474236
> 11/05/11 19:02:37 INFO datastore.InMemoryBayesDatastore: realestate
> -103464.88819958708 168594.15797711344 -0.6136920130627087
> 11/05/11 19:02:37 INFO datastore.InMemoryBayesDatastore: automobiles
> -168594.15797711344 168594.15797711344 -1.0
> 11/05/11 19:02:37 INFO bayes.TestClassifier:
> =======================================================
> Summary
> -------------------------------------------------------
> Correctly Classified Instances          :          0         �%
> Incorrectly Classified Instances        :          0         �%
> Total Classified Instances              :          0
>
> =======================================================
> Confusion Matrix
> -------------------------------------------------------
> a     b     c     <--Classified as
> 0     0     0     |  0     a     = realestate
> 0     0     0     |  0     b     = automobiles
> 0     0     0     |  0     c     = unknown
> Default Category: unknown: 2
>
>
> 11/05/11 19:02:37 INFO driver.MahoutDriver: Program took 2309 ms
>
> Regards,
> Dipti Mathur
>
