hadoop-common-user mailing list archives

From Tim Broberg <Tim.Brob...@exar.com>
Subject RE: Matrix multiplication in Hadoop
Date Sat, 19 Nov 2011 16:34:54 GMT
Perhaps this is a good candidate for a native library, then?

From: Mike Davis [xmikedavis@gmail.com]
Sent: Friday, November 18, 2011 7:39 PM
To: common-user@hadoop.apache.org
Subject: Re: Matrix multiplication in Hadoop

On Friday, November 18, 2011, Mike Spreitzer <mspreitz@us.ibm.com> wrote:
>  Why is matrix multiplication ill-suited for Hadoop?

IMHO, a huge issue here is the JVM's inability to fully support CPU
vendor-specific SIMD instructions and, by extension, optimized BLAS routines.
Running a large MM task using Intel's MKL rather than relying on generic
compiler optimization is orders of magnitude faster on a single multicore
processor. I see almost no way that Hadoop could win such a CPU-intensive
task against an MPI cluster with even a tenth of the nodes running a
decently tuned BLAS library. Racing even against a single CPU might be
difficult, given the I/O overhead.
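
For reference, the "generic" baseline discussed above looks roughly like the
following in plain Java. This is only a minimal sketch: the class name
NaiveMatMul and its multiply method are hypothetical, invented for
illustration, and are not part of Hadoop or of any BLAS binding. It assumes
dense, non-empty, rectangular matrices stored as double[][]. The loops run in
i-p-j order so that the innermost loop walks rows of b and c sequentially;
whether the JIT vectorizes that loop depends on the JVM and the hardware.

```java
import java.util.Arrays;

/** Hypothetical illustration class; not part of Hadoop or any BLAS binding. */
final class NaiveMatMul {

    /** Returns a * b for dense row-major matrices, using plain scalar loops. */
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length;      // rows of a
        int k = b.length;      // rows of b, must equal columns of a
        int m = b[0].length;   // columns of b
        if (a[0].length != k) {
            throw new IllegalArgumentException("inner dimensions do not match");
        }
        double[][] c = new double[n][m];  // Java zero-initializes the array
        for (int i = 0; i < n; i++) {
            for (int p = 0; p < k; p++) {
                double aip = a[i][p];     // reused across the whole inner loop
                for (int j = 0; j < m; j++) {
                    c[i][j] += aip * b[p][j];
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        // prints [[19.0, 22.0], [43.0, 50.0]]
        System.out.println(Arrays.deepToString(multiply(a, b)));
    }
}
```

A tuned BLAS dgemm adds cache blocking, hand-written SIMD kernels, and
multithreading on top of this same arithmetic, which is where the performance
gap described above comes from.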

Still, it's a reasonably common problem and we shouldn't murder the good in
favor of the best. I'm certain an MM/LinAlg Hadoop library with even
mediocre performance relative to C would get used.

Mike Davis

