mahout-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: seqdirectory doesn't seem to be generating seqfiles...?
Date Fri, 24 Feb 2012 03:47:13 GMT
What does the following filter class do? And is it what you want?

org.apache.mahout.text.PrefixAdditionFilter

You can run these apps from inside Eclipse/IntelliJ and single-step
through the code where it walks the input files.
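
Before stepping through Mahout itself, it may help to confirm what a plain directory walk finds at the input path. The following is a minimal sketch, not Mahout code: the class name `InputWalkCheck` and its `listFiles` helper are hypothetical, and it looks only at the local filesystem. The "Running on hadoop" line in the quoted log suggests the job may resolve paths through Hadoop instead, which is an assumption to verify and not something this thread confirms.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Mahout: lists the regular files found
// under a local directory so the result can be compared with what
// seqdirectory appears to process.
public class InputWalkCheck {

    // Recursively collect every regular file under root, in sorted order.
    static List<Path> listFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // The default mirrors the -i argument from the command in this thread.
        Path root = Paths.get(args.length > 0 ? args[0] : "examples/reuters-extracted");
        if (!Files.isDirectory(root)) {
            System.out.println("Not a local directory: " + root.toAbsolutePath());
            return;
        }
        List<Path> files = listFiles(root);
        System.out.println(files.size() + " regular files under " + root.toAbsolutePath());
        // Print a few entries as a spot check.
        files.stream().limit(5).forEach(p -> System.out.println("  " + p));
    }
}
```

If this reports the expected files locally while seqdirectory still writes nothing, the next thing to inspect in the debugger is which filesystem the input path is being resolved against.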

On Wed, Feb 22, 2012 at 7:01 PM, Temese Szalai <temeseszalai@gmail.com> wrote:
> Hello -
>
> I'm new to Mahout and I'm not having any luck trying to use seqdirectory to
> create seqfiles so that I can then generate vectors from text files.
> Seems like this operation should work like a charm.
>
> Here is the command that I used to attempt to process the Reuters corpus
> into seqfiles and the output that I got in the terminal.
>
> $ bin/mahout seqdirectory -c UTF-8 -i examples/reuters-extracted/ -o reuters-seqfiles
> Running on hadoop, using HADOOP_HOME=/Users/temeseszalai/Desktop/hadoop-0.20.203.0
> No HADOOP_CONF_DIR set, using /Users/temeseszalai/Desktop/hadoop-0.20.203.0/src/conf
> 12/02/22 16:29:01 INFO common.AbstractJob: Command line arguments:
> {--charset=UTF-8, --chunkSize=64, --endPhase=2147483647,
> --fileFilterClass=org.apache.mahout.text.PrefixAdditionFilter,
> --input=examples/reuters-extracted/, --keyPrefix=,
> --output=reuters-seqfiles, --startPhase=0, --tempDir=temp}
> 12/02/22 16:29:02 INFO driver.MahoutDriver: Program took 418 ms
>
> I am using mahout-distribution-0.5 on Mac OSX (10.7.3).
> I don't get any error messages from seqdirectory. I just don't get any
> seqfiles.
>
> The output directory is always empty and the run time is always
> minimal. I have tried different data and different paths, and have had
> someone else with considerably more Java experience sanity-check it,
> and still no luck.
>
> I'm clearly doing something wrong ... No idea what ... I've tried poking
> around to see if anyone else has had the same issue and haven't turned up
> much that is useful.
>
> Any thoughts? Guidance would definitely be appreciated.
>
> Thanks in advance.
> Temese



-- 
Lance Norskog
goksron@gmail.com
