mahout-user mailing list archives

From Sean Owen <>
Subject Re: MatrixMultiplicationJob runs with 1 mapper only ?
Date Wed, 16 Jan 2013 07:53:29 GMT
It's up to Hadoop in the end.

Try calling FileInputFormat.setMaxInputSplitSize() with a smallish
value, like your 10MB (10000000).
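
To see why a max split size should change the task count, here is a rough, self-contained sketch of the split arithmetic as I understand it from the new-API FileInputFormat (org.apache.hadoop.mapreduce.lib.input): split size is max(minSize, min(maxSize, blockSize)), and the last split may run up to 10% over. It is not Hadoop code. The class and method names are made up for illustration, and 104MB is assumed to mean 104 * 1024 * 1024 bytes. In a real driver the call would be FileInputFormat.setMaxInputSplitSize(job, 10000000L). Whether the input format used by MatrixMultiplicationJob honors that setting is not something this sketch can tell you.

```java
// Illustrative sketch only: mimics FileInputFormat's split arithmetic
// without depending on Hadoop. All names here are hypothetical.
class SplitEstimate {
    // Assumption: the tail split may be up to 10% larger than splitSize
    // so that a tiny trailing split is not emitted.
    static final double SPLIT_SLOP = 1.1;

    // Effective split size given block size and the min/max split settings.
    static long splitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Number of splits (and so map tasks) for one splittable file.
    static int countSplits(long fileBytes, long splitSize) {
        int splits = 0;
        long remaining = fileBytes;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            remaining -= splitSize;
            splits++;
        }
        if (remaining > 0) {
            splits++; // whatever is left becomes the final split
        }
        return splits;
    }

    public static void main(String[] args) {
        long fileBytes = 104L * 1024 * 1024;   // assumed size of the 104MB file
        long blockSize = 64L * 1024 * 1024;    // default block size from the thread

        // No max split size set: the block size decides.
        long defaultSplit = splitSize(blockSize, 1L, Long.MAX_VALUE);
        System.out.println("default: " + countSplits(fileBytes, defaultSplit) + " splits");

        // Max split size of 10000000 bytes, as suggested above.
        long cappedSplit = splitSize(blockSize, 1L, 10000000L);
        System.out.println("capped:  " + countSplits(fileBytes, cappedSplit) + " splits");
    }
}
```

If the job still launches a single mapper after this, the input format it uses is probably not splitting the file at all, and the split size settings will not help.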

I'm also not sure Hadoop params can be set as system properties like that anyway.

On Wed, Jan 16, 2013 at 7:48 AM, Stuti Awasthi <> wrote:
> Hi,
> I am trying to multiply a dense matrix of size [100 x 100k]. The file is 104MB, and with the default block size of 64MB only 2 blocks are created.
> So I reduced the block size to 10MB, and now my file is divided into 11 blocks across the cluster. The cluster has 10 nodes: 1 NN/JT and 9 DN/TT.
> Every time I run the Mahout MatrixMultiplicationJob from the command line, I can see on the JobTracker web UI that only 1 map task is launched. According to my understanding of InputSplit, there should be 11 map tasks launched.
> Apart from this, the map task stays at 0.99% completion, and in the task logs I can see that the map task is spilling the map output.
> Mahout command:
> mahout matrixmult -Dfs.inmemory.size.mb=200 -Dio.sort.factor=100 -Dio.sort.mb=200 -Dio.file.buffer.size=131072 --inputPathA /test/matrixA --numRowsA 100 --numColsA 100000 --inputPathB /test/matrixA --numRowsB 100 --numColsB 100000 --tempDir /test/temp
> Now I want to know why only 1 map task is launched every time, and how I can performance-tune the cluster so that I can perform dense matrix multiplication of the order [90K x 1 Million].
> Thanks
> Stuti
