mahout-user mailing list archives

From Kris Jack <>
Subject Re: Setting Number of Mappers and Reducers in DistributedRowMatrix Jobs
Date Mon, 14 Jun 2010 14:39:36 GMT
Hi Sean,

Yes, I tried using those parameters but they didn't seem to have any
effect.  What's more, the number of reducers never increased above 1,
meaning that I never got to see any results when running with large data
sets (doing matrix multiplication).

I looked in the code to find where these parameters were being read by the
jobs that I was using (i.e. MatrixMultiplicationJob and TransposeJob) but
couldn't find them.  As a result, I modified their builders and called the
setNumMapTasks and setNumReduceTasks methods on the conf objects.  This
now works from the command line using the parameters that you suggested.
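For reference, an invocation along these lines might look like the following. This is an illustrative sketch, not a command from the thread: the jar name, input paths, and dimension flags are assumptions, though `MatrixMultiplicationJob` is a real Mahout driver and `mapred.map.tasks`/`mapred.reduce.tasks` are the standard Hadoop property names that `setNumMapTasks`/`setNumReduceTasks` correspond to.

```shell
# Hypothetical run of Mahout's MatrixMultiplicationJob with explicit task counts.
# The -D properties are generic Hadoop options and must appear before the
# job's own arguments so the GenericOptionsParser picks them up.
hadoop jar mahout-core-0.3.job \
  org.apache.mahout.math.hadoop.MatrixMultiplicationJob \
  -Dmapred.map.tasks=10 \
  -Dmapred.reduce.tasks=10 \
  --numRowsA 1000 --numColsA 1000 \
  --numRowsB 1000 --numColsB 1000 \
  --inputPathA /path/to/A \
  --inputPathB /path/to/B
```

Note that `mapred.map.tasks` is only a hint to Hadoop (the actual number of map tasks is driven by the input splits), whereas `mapred.reduce.tasks` is honoured exactly, which is why the reducer count is the one worth forcing for large matrix multiplications.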

Please do let me know if I was just not calling them correctly or if you
think that there already exists an alternative way to do this.  I would like
to use Mahout as it was intended and not make lots of little changes myself
if they aren't necessary.


2010/6/11 Sean Owen <>

> and same for reduce? These should be Hadoop params
> you set directly to Hadoop.
> On Fri, Jun 11, 2010 at 5:07 PM, Kris Jack <> wrote:
> > Hi everyone,
> >
> > I am running code that uses some of the jobs defined in the
> > DistributedRowMatrix class and would like to know if I can define the
> > number of mappers and reducers that they use when running?  In
> > particular, with the jobs:
> >
> > - MatrixMultiplicationJob
> > - TransposeJob
> >
> > I am comfortable with changing the code to get this to work, but I was
> > wondering if the algorithmic logic being employed would allow multiple
> > mappers and reducers.
> >
> > Thanks,
> > Kris
> >

Dr Kris Jack,
