mahout-user mailing list archives

From "Divya" <di...@k2associates.com.sg>
Subject RE: NPE in bayes wiki example
Date Mon, 29 Nov 2010 07:24:58 GMT
Hi,

The steps I followed are below:

$ bin/mahout wikipediaDataSetCreator \
    -i D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/Traininput \
    -o examples/bin/work/wikipedia/wikipediaClassification/train-subject \
    -c $MAHOUT_HOME/examples/src/test/resources/subjects.txt

$ bin/mahout wikipediaDataSetCreator \
    -i D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/Testinput \
    -o examples/bin/work/wikipedia/wikipediaClassification/test-subject \
    -c $MAHOUT_HOME/examples/src/test/resources/subjects.txt

$ bin/mahout trainclassifier \
    -i examples/bin/work/wikipedia/wikipediaClassification/train-subject \
    -o examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model

$ bin/mahout testclassifier \
    -m examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model \
    -d examples/bin/work/wikipedia/wikipediaClassification/test-subject
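
A quick check that may help narrow this down, offered as a sketch and not a confirmed diagnosis: the quoted log below shows only a "history" entry being loaded from the model, while subjects.txt lists both History and Science, so it could be worth comparing which labels actually occur in train-subject and test-subject. The helper names (count_labels, labels_missing_from) are made up for illustration, and the script assumes each part-* file is plain text with one document per line in the form label<TAB>text. If the dataset creator writes a different format on your setup, the cut step needs adjusting.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting the dataset directories.
# ASSUMPTION: every part-* file is plain text, one document per line,
# formatted as "<label><TAB><document text>".

# Print the number of documents per label found in a dataset directory.
count_labels() {
    cat "$1"/part-* | cut -f1 | sort | uniq -c
}

# Print labels that occur in the second directory but not in the first.
labels_missing_from() {
    first_labels=$(mktemp)
    second_labels=$(mktemp)
    cat "$1"/part-* | cut -f1 | sort -u > "$first_labels"
    cat "$2"/part-* | cut -f1 | sort -u > "$second_labels"
    # comm -13 keeps only the lines unique to the second file.
    comm -13 "$first_labels" "$second_labels"
    rm -f "$first_labels" "$second_labels"
}
```

With the functions loaded into a shell, something like the following would list any label present in the test set that never appeared in training:

    BASE=examples/bin/work/wikipedia/wikipediaClassification
    count_labels "$BASE/train-subject"
    labels_missing_from "$BASE/train-subject" "$BASE/test-subject"

If the second command prints a label, that is only a hint about where to look next, not proof of the cause of the exception.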


Regards,
Divya



-----Original Message-----
From: Grant Ingersoll [mailto:gsingers@apache.org] 
Sent: Saturday, November 27, 2010 8:54 PM
To: user@mahout.apache.org
Subject: Re: NPE in bayes wiki example

Can you provide all the steps you have done up to this point?

-Grant

On Nov 25, 2010, at 12:57 AM, Divya wrote:

> Hi,
>
> I am getting a NullPointerException when I pass my test input data to
> testclassifier:
>
> $ bin/mahout testclassifier \
>     -m examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model \
>     -d examples/bin/work/wikipedia/wikipediaClassification/test-subject
> 
> Running on hadoop, using HADOOP_HOME=C:\cygwin\home\Divya\hadoop-0.20.2
> HADOOP_CONF_DIR=C:\cygwin\home\Divya\hadoop-0.20.2\conf
> 10/11/25 13:51:36 INFO bayes.TestClassifier: Loading model from: {basePath=examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model, classifierType=bayes, alpha_i=1.0, dataSource=hdfs, gramSize=1, verbose=false, encoding=UTF-8, defaultCat=unknown, testDirPath=examples/bin/work/wikipedia/wikipediaClassification/test-subject}
> 10/11/25 13:51:36 INFO bayes.TestClassifier: Testing Bayes Classifier
> 10/11/25 13:51:38 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-weights/Sigma_j/part-00000
> 10/11/25 13:51:38 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-weights/Sigma_k/part-00000
> 10/11/25 13:51:38 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-weights/Sigma_kSigma_j/part-00000
> 10/11/25 13:51:38 INFO io.SequenceFileModelReader: 8.048212844092422
> 10/11/25 13:51:39 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-thetaNormalizer/part-00000
> 10/11/25 13:51:39 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-tfIdf/trainer-tfIdf/part-00000
> 10/11/25 13:51:39 INFO datastore.InMemoryBayesDatastore: history -23722.080627413125 23722.080627413125 -1.0
> 
> Exception in thread "main" java.lang.NullPointerException
>        at org.apache.mahout.classifier.ConfusionMatrix.getCount(ConfusionMatrix.java:102)
>        at org.apache.mahout.classifier.ConfusionMatrix.incrementCount(ConfusionMatrix.java:118)
>        at org.apache.mahout.classifier.ConfusionMatrix.incrementCount(ConfusionMatrix.java:122)
>        at org.apache.mahout.classifier.ConfusionMatrix.addInstance(ConfusionMatrix.java:90)
>        at org.apache.mahout.classifier.ResultAnalyzer.addInstance(ResultAnalyzer.java:68)
>        at org.apache.mahout.classifier.bayes.TestClassifier.classifySequential(TestClassifier.java:266)
>        at org.apache.mahout.classifier.bayes.TestClassifier.main(TestClassifier.java:186)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:184)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> 
> 
> 
> My categories file (the one passed with -c) is subjects.txt, which has two
> entries: History and Science.
>
> But when I pass the training input data instead, I do get results:
>
> $ bin/mahout testclassifier \
>     -m examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model \
>     -d examples/bin/work/wikipedia/wikipediaClassification/train-subject
> 
> Running on hadoop, using HADOOP_HOME=C:\cygwin\home\Divya\hadoop-0.20.2
> HADOOP_CONF_DIR=C:\cygwin\home\Divya\hadoop-0.20.2\conf
> 10/11/25 13:51:54 INFO bayes.TestClassifier: Loading model from: {basePath=examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model, classifierType=bayes, alpha_i=1.0, dataSource=hdfs, gramSize=1, verbose=false, encoding=UTF-8, defaultCat=unknown, testDirPath=examples/bin/work/wikipedia/wikipediaClassification/train-subject}
> 10/11/25 13:51:54 INFO bayes.TestClassifier: Testing Bayes Classifier
> 10/11/25 13:51:55 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-weights/Sigma_j/part-00000
> 10/11/25 13:51:55 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-weights/Sigma_k/part-00000
> 10/11/25 13:51:55 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-weights/Sigma_kSigma_j/part-00000
> 10/11/25 13:51:55 INFO io.SequenceFileModelReader: 8.048212844092422
> 10/11/25 13:51:55 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-thetaNormalizer/part-00000
> 10/11/25 13:51:55 INFO io.SequenceFileModelReader: file:/D:/mahout-0.4/examples/bin/work/wikipedia/wikipediaClassification/wikipedia-subject-model/trainer-tfIdf/trainer-tfIdf/part-00000
> 10/11/25 13:51:55 INFO datastore.InMemoryBayesDatastore: history -23722.080627413125 23722.080627413125 -1.0
> 
> 10/11/25 13:51:55 INFO bayes.TestClassifier: Classified instances from part-r-00000
> 10/11/25 13:51:55 INFO bayes.TestClassifier: =======================================================
> Summary
> -------------------------------------------------------
> Correctly Classified Instances          :          2           100%
> Incorrectly Classified Instances        :          0             0%
> Total Classified Instances              :          2
>
> =======================================================
> Confusion Matrix
> -------------------------------------------------------
> a       <--Classified as
> 2        |  2           a     = history
> Default Category: unknown: 1
>
> 10/11/25 13:51:55 INFO driver.MahoutDriver: Program took 953 ms
>
> Can someone please explain why the test data fails while the training data
> works?
>
> Thanks
>
> Regards,
> Divya
>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem docs using Solr/Lucene:
http://www.lucidimagination.com/search


