hadoop-common-user mailing list archives

From He Chen <airb...@gmail.com>
Subject Re: Matrix multiplication in Hadoop
Date Sat, 19 Nov 2011 17:02:58 GMT
Did you try Hama?

There are many methods.

1) Use Hadoop MPI, which allows you to use MPI MM code based on Hadoop;

2) Use Hama, which is designed for MM;

3) Use pure Hadoop Java MapReduce.

I did this before, but it may not be the optimal algorithm. Put your first matrix
in DistributedCache and feed the second matrix line by line as the job input. For each
line, the mapper multiplies that line, as an array, by the first matrix held in
DistributedCache. Use a reducer to collect the result matrix. This algorithm
is limited by your DistributedCache size, so it is suitable for multiplying a small
matrix by a huge matrix.
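
A rough sketch of that idea, written in plain Python rather than against the Hadoop API so that it runs on its own. It is only one reading of the approach: the small matrix (here `cached`) stands in for the file placed in DistributedCache, each input line carries one row of the huge matrix, and the result is the huge matrix times the small one. The line format (`row_index<TAB>values`) and all function names are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

Matrix = List[List[float]]


def parse_row(line: str) -> Tuple[int, List[float]]:
    """Parse 'row_index<TAB>v0 v1 v2 ...' (an assumed format) into (index, values)."""
    index_text, values_text = line.split("\t", 1)
    return int(index_text), [float(v) for v in values_text.split()]


def map_row(line: str, cached: Matrix) -> Tuple[int, List[float]]:
    """Map step: multiply one row of the huge matrix by the small cached matrix."""
    row_index, row = parse_row(line)
    if len(row) != len(cached):
        raise ValueError("row length must equal the number of rows in the cached matrix")
    n_cols = len(cached[0])
    product = [
        sum(row[k] * cached[k][j] for k in range(len(row)))
        for j in range(n_cols)
    ]
    return row_index, product


def reduce_rows(pairs: Iterable[Tuple[int, List[float]]]) -> Matrix:
    """Reduce step: collect the mapped rows and put them back in row order."""
    by_index = dict(pairs)
    return [by_index[i] for i in sorted(by_index)]


def multiply(lines: Iterable[str], cached: Matrix) -> Matrix:
    """Run the map step over every input line, then the collecting reduce step."""
    return reduce_rows(map_row(line, cached) for line in lines)


if __name__ == "__main__":
    small = [[1.0, 2.0], [3.0, 4.0]]
    # Lines may arrive in any order; the row index restores the ordering.
    huge_lines = ["1\t0 1", "0\t1 0", "2\t1 1"]
    for result_row in multiply(huge_lines, small):
        print(result_row)
```

Each mapper needs the whole small matrix in memory, which is where the size limit comes from: a dense 10,000 x 10,000 matrix stored as 8-byte doubles is already about 800 MB.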

Chen
On Sat, Nov 19, 2011 at 10:34 AM, Tim Broberg <Tim.Broberg@exar.com> wrote:

> Perhaps this is a good candidate for a native library, then?
>
> ________________________________________
> From: Mike Davis [xmikedavis@gmail.com]
> Sent: Friday, November 18, 2011 7:39 PM
> To: common-user@hadoop.apache.org
> Subject: Re: Matrix multiplication in Hadoop
>
>  On Friday, November 18, 2011, Mike Spreitzer <mspreitz@us.ibm.com> wrote:
> >  Why is matrix multiplication ill-suited for Hadoop?
>
> IMHO, a huge issue here is the JVM's inability to fully support CPU-vendor-specific
> SIMD instructions and, by extension, optimized BLAS routines.
> Running a large MM task using Intel's MKL rather than relying on generic
> compiler optimization is orders of magnitude faster on a single multicore
> processor. I see almost no way that Hadoop could win such a CPU-intensive
> task against an MPI cluster with even a tenth of the nodes running with a
> decently tuned BLAS library. Racing even against a single CPU might be
> difficult, given the I/O overhead.
>
> Still, it's a reasonably common problem and we shouldn't murder the good in
> favor of the best. I'm certain a MM/LinAlg Hadoop library with even
> mediocre performance, wrt C, would get used.
>
> --
> Mike Davis
