mahout-user mailing list archives

From "Divya" <>
Subject classification example queries
Date Wed, 24 Nov 2010 02:07:00 GMT


I am following  to
run the 20 news example for classification 


I have a few questions about it:


When I follow the steps below:

1) Extract dataset

  tar zxf 20news-bydate.tar.gz 


2) Generate the input data set to train the classifier

$ bin/mahout org.apache.mahout.classifier.bayes.PrepareTwentyNewsgroups  -p

e-train -o D:/mahout-0.4/examples/bin/work/20news-bydate/bayes-train-input
-a org.apache.mahout.vectorizer.DefaultAnalyzer  -c UTF-8


3) Generate the input data set to test the classifier

bin/mahout org.apache.mahout.classifier.bayes.PrepareTwentyNewsgroups  -p

-o D:/mahout-0.4/examples/bin/work/20news-bydate/bayes-test-input -a
org.apache.mahout.vectorizer.DefaultAnalyzer  -c UTF-8


4) Train the classifier

bin/mahout trainclassifier -i
examples/bin/work/20news-bydate/bayes-train-input -o


5) Test the classifier

bin/mahout testclassifier -m examples/bin/work/20news-bydate/bayes-model -d


It reads my test input files during classification, and I am
able to see the results @


But when I run classification steps 4 and 5 as follows:

Step 4

$ bin/mahout trainclassifier -i
examples/bin/work/20news-bydate/bayes-train-input -o
examples/bin/work/20news-bydate/bayes-model-parameters -type bayes -ng 3
-source hdfs

Step 5

$ bin/mahout testclassifier -m
examples/bin/work/20news-bydate/bayes-model-parameters -d
examples/bin/work/20news-test-input -type bayes -ng 3 -method sequential
-source hdfs


It does not read my test input files, and I get the results @



Why does classification not work when I pass parameters like -type bayes
-ng 3 -source hdfs?


Can someone please help me out?












