mahout-user mailing list archives

From Mahmood Naderan <nt_mahm...@yahoo.com>
Subject Re: trainclassifier/trainnb
Date Tue, 25 Mar 2014 16:08:34 GMT
OK. The Wikipedia example on these pages

http://mahout.apache.org/users/classification/wikipedia-bayes-example.html
https://cwiki.apache.org/confluence/display/MAHOUT/Wikipedia+Bayes+Example

is valid for older Mahout releases. I have no problem with Mahout 0.6. However, the last two
commands (trainclassifier and testclassifier) are not valid on Mahout 0.9. For example, trainnb
needs a labels option, but I don't know what to pass because trainclassifier has no such option.

As I said, the example runs under Mahout 0.6. If you confirm that Mahout 0.6 and 0.9 do not
differ with regard to the Wikipedia Bayes example, then I will forget about 0.9. I do see many
changes in the changelog, though.

 
Regards,
Mahmood



On Tuesday, March 25, 2014 6:19 PM, Suneel Marthi <suneel_marthi@yahoo.com> wrote:
 
If you are looking for an example of usage, see examples/bin/classify-20newsgroups.sh
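
As a rough sketch only, the trainnb options shown in the usage output quoted further down could be combined along the following lines. The directory names are placeholders, not paths from the Wikipedia example, and the sketch assumes the input has already been converted to vectors, which is what that script does before training. The command is only printed here, so the snippet runs without a Mahout installation.

```shell
# Placeholder paths (assumptions, not taken from the Wikipedia example).
TRAIN_VECTORS="wikipedia-train-vectors"   # assumed: vectorized training input
MODEL_DIR="wikipediamodel"                # output directory for the model
LABEL_INDEX="labelindex"                  # where trainnb stores the label index

# -el  extract the labels from the input (instead of listing them with -l)
# -li  path to store the label index in
# -ow  overwrite the output directory before running the job
TRAINNB_CMD="mahout trainnb -i $TRAIN_VECTORS -o $MODEL_DIR -el -li $LABEL_INDEX -ow"

# Dry run: print the command rather than executing it.
echo "$TRAINNB_CMD"
```

With -el the labels are taken from the input itself, so an explicit -l list should not be needed in that case.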





> On Mar 25, 2014, at 9:28 AM, Andrew Musselman <andrew.musselman@gmail.com> wrote:
> 
> If you need to see which options are available for a given job you can just
> run $MAHOUT_HOME/bin/mahout jobname to see the usage:
> 
> $ bin/mahout trainnb
> Running on hadoop, using /home/user/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> MAHOUT-JOB: /home/user/mahout/examples/target/mahout-examples-1.0-SNAPSHOT-job.jar
> 14/03/25 06:24:22 WARN driver.MahoutDriver: No trainnb.props found on classpath, will use command-line arguments only
> 14/03/25 06:24:23 ERROR common.AbstractJob: No input specified or -Dmapred.input.dir must be provided to specify input directory
> Usage:
> 
> [--input <input> --output <output> --labels <labels> --extractLabels
> --alphaI <alphaI> --trainComplementary --labelIndex <labelIndex>
> --overwrite --help --tempDir <tempDir> --startPhase <startPhase>
> --endPhase <endPhase>]
> 
> Job-Specific Options:
> 
>  --input (-i) input               Path to job input directory.
> 
>  --output (-o) output             The directory pathname for output.
> 
>  --labels (-l) labels             comma-separated list of labels to
>                                   include in training
> 
>  --extractLabels (-el)            Extract the labels from the input
> 
>  --alphaI (-a) alphaI             smoothing parameter
> 
>  --trainComplementary (-c)        train complementary?
> 
>  --labelIndex (-li) labelIndex    The path to store the label index in
> 
>  --overwrite (-ow)                If present, overwrite the output
>                                   directory before running job
> 
>  --help (-h)                      Print out help
> 
>  --tempDir tempDir                Intermediate output directory
> 
>  --startPhase startPhase          First phase to run
> 
>  --endPhase endPhase              Last phase to run
> 
> 
> On Tue, Mar 25, 2014 at 3:17 AM, Mahmood Naderan <nt_mahmood@yahoo.com> wrote:
> 
>> Hi,
>> What is the correct syntax for this old command?
>> 
>>   mahout trainclassifier -i traininginput -o wikipediamodel -mf 4 -ms 4
>> 
>> It seems that trainclassifier has been replaced by trainnb, but the
>> latter has no -mf option.
>> 
>> 
>> Regards,
>> Mahmood