mahout-dev mailing list archives

From "leon lee (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAHOUT-476) bug when running org.apache.mahout.classifier.bayes.WikipediaDatasetCreatorDriver on hadoop
Date Fri, 13 Aug 2010 06:52:16 GMT
bug when running org.apache.mahout.classifier.bayes.WikipediaDatasetCreatorDriver on hadoop
-------------------------------------------------------------------------------------------

                 Key: MAHOUT-476
                 URL: https://issues.apache.org/jira/browse/MAHOUT-476
             Project: Mahout
          Issue Type: Bug
          Components: Classification
    Affects Versions: 0.3
         Environment: hadoop 0.20.2, mahout-0.3, ubuntu
            Reporter: leon lee


I followed the wiki instructions at https://cwiki.apache.org/MAHOUT/wikipedia-bayes-example.html

(by the way, the Bayes examples document on the wiki needs to be updated for 0.3)
to run step 5:
Create the countries based Split of wikipedia dataset.

I used the following command:
$HADOOP_HOME/bin/hadoop jar $MAHOUT_HOME/examples/target/mahout-examples-0.3.job \
  org.apache.mahout.classifier.bayes.WikipediaDatasetCreatorDriver \
  -i $MAHOUT_HOME/examples/work/wikipedia/chunks \
  -o $MAHOUT_HOME/examples/work/wikipediainput \
  -c $MAHOUT_HOME/examples/src/test/resources/country.txt

The job failed on Hadoop.
The Hadoop log reports:
Error: org.apache.lucene.wikipedia.analysis.WikipediaTokenizer.addAttribute(Ljava/lang/Class;)Lorg/apache/lucene/util/Attribute

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

