mahout-user mailing list archives

From Alan Said <Alan.S...@dai-labor.de>
Subject Job with multiple input paths and mappers
Date Mon, 29 Nov 2010 17:54:52 GMT
Hi all,
I'm trying to get https://issues.apache.org/jira/browse/MAHOUT-106 running on Mahout 0.4 and
Hadoop 0.20.2. However, I'm stuck at the point where a job with multiple input paths and mappers
is created, as shown in the code below.

    MultipleInputs.addInputPath(psz, new Path(sumSUQStarPath).makeQualified(fsPsz),
                                SequenceFileInputFormat.class, Psz.PszSumSUQStarMapper.class);
    MultipleInputs.addInputPath(psz, new Path(sumUQStarPath).makeQualified(fsPsz),
                                SequenceFileInputFormat.class, Psz.PszSumUQStarMapper.class);

    prepareJobConfWithMultipleInputs(psz,
                                         pszNextPath,
                                         VarIntWritable.class,
                                         LongFloatWritable.class,
                                         Psz.PszReducer.class,
                                         VarLongWritable.class,
                                         IntFloatWritable.class,
                                         SequenceFileOutputFormat.class);
    JobClient.runJob(psz);

I'm not quite sure how this should be written for the current APIs.
AbstractJob's current prepareJob method can handle multiple input paths via org.apache.hadoop.fs.Path,
but I'm not sure how to handle the extra mapper.

Any help would be appreciated.

Thanks,
Alan


