mahout-user mailing list archives

From "Divya" <di...@k2associates.com.sg>
Subject classification example queries
Date Wed, 24 Nov 2010 02:07:00 GMT
Hi,

 

I am following https://cwiki.apache.org/MAHOUT/twenty-newsgroups.html to
run the 20 Newsgroups classification example.

 

I have a few questions about it:

 

When I follow the steps below:

1) Extract the dataset

  tar zxf 20news-bydate.tar.gz 

 

2) Generate the input data set to train the classifier

$ bin/mahout org.apache.mahout.classifier.bayes.PrepareTwentyNewsgroups -p
D:/mahout-0.4/examples/bin/work/20news-bydate/20news-bydate-train
-o D:/mahout-0.4/examples/bin/work/20news-bydate/bayes-train-input
-a org.apache.mahout.vectorizer.DefaultAnalyzer -c UTF-8

 

3) Generate the input data set to test the classifier

bin/mahout org.apache.mahout.classifier.bayes.PrepareTwentyNewsgroups  -p
D:/mahout-0.4/examples/bin/work/20news-bydate/20news-bydate-test

-o D:/mahout-0.4/examples/bin/work/20news-bydate/bayes-test-input -a
org.apache.mahout.vectorizer.DefaultAnalyzer  -c UTF-8

 

4) Train the classifier

bin/mahout trainclassifier -i
examples/bin/work/20news-bydate/bayes-train-input -o
examples/bin/work/20news-bydate/bayes-model

 

5) Test the classifier

bin/mahout testclassifier -m examples/bin/work/20news-bydate/bayes-model -d
examples/bin/work/20news-bydate/bayes-test-input

 

It reads my test input files during classification, and I can see the
results at http://pastebin.com/D5ejTwEW

 

But when I run classification steps 4 and 5 as follows:

Step 4

$ bin/mahout trainclassifier -i
examples/bin/work/20news-bydate/bayes-train-input -o
examples/bin/work/20news-bydate/bayes-model-parameters -type bayes -ng 3
-source hdfs

Step 5

$ bin/mahout testclassifier -m
examples/bin/work/20news-bydate/bayes-model-parameters -d
examples/bin/work/20news-test-input -type bayes -ng 3 -method sequential
-source hdfs

 

This time it does not read my test input files, and I get the results at
http://pastebin.com/eFazJRvU
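
For reference, the two test runs pass different values to -d, and a small
shell check can show whether each directory exists and holds any files. This
is only a sketch: check_input_dir is a hypothetical helper, not part of
Mahout, and it assumes the relative paths resolve on the local filesystem
from the Mahout directory, which may not match how -source hdfs resolves them.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mahout): report whether a directory
# exists and how many regular files it contains.
check_input_dir() {
  dir="$1"
  if [ -d "$dir" ]; then
    # Count regular files below the directory; strip wc's padding.
    count=$(find "$dir" -type f | wc -l | tr -d '[:space:]')
    echo "ok: $dir ($count files)"
  else
    echo "missing: $dir"
    return 1
  fi
}

# The -d value from the first test run, then the one from the second run.
# Assumption: both are checked as local paths relative to the current directory.
check_input_dir examples/bin/work/20news-bydate/bayes-test-input || true
check_input_dir examples/bin/work/20news-test-input || true
```

A "missing" line or a count of 0 files would suggest the classifier has no
test documents to read from that location.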

 

 

Why does classification not work when I pass parameters such as -type bayes
-ng 3 -source hdfs?

 

Can someone please help me out?

Thanks

Regards,

Divya