hadoop-common-user mailing list archives

From Mike Davis <xmikeda...@gmail.com>
Subject Re: Matrix multiplication in Hadoop
Date Sat, 19 Nov 2011 03:39:40 GMT
On Friday, November 18, 2011, Mike Spreitzer <mspreitz@us.ibm.com> wrote:
>  Why is matrix multiplication ill-suited for Hadoop?

IMHO, a huge issue here is the JVM's inability to fully support
CPU-vendor-specific SIMD instructions and, by extension, optimized BLAS
routines. Running a large MM task using Intel's MKL rather than relying on
generic compiler optimization is orders of magnitude faster on a single
multicore processor. I see almost no way that Hadoop could win such a
CPU-intensive task against an MPI cluster with even a tenth of the nodes
running a decently tuned BLAS library. Racing even against a single CPU
might be difficult, given the I/O overhead.
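
To make the comparison concrete, here is a minimal sketch of the kind of
plain-Java kernel a Hadoop task would fall back on without a native BLAS.
The class and method names (NaiveMatMul, multiply) are hypothetical and
exist only for illustration; this is not code from Hadoop or any library.
The loops are written in scalar form, so any SIMD use is left to whatever
the JIT decides to do, whereas a tuned BLAS typically adds cache blocking
and hand-written vector kernels on top of the same arithmetic.

```java
// Hypothetical illustration only: a dense matrix multiply in plain Java.
final class NaiveMatMul {

    /** Returns C = A * B for non-empty, rectangular, row-major matrices. */
    static double[][] multiply(double[][] a, double[][] b) {
        if (a.length == 0 || b.length == 0 || a[0].length != b.length) {
            throw new IllegalArgumentException("incompatible matrix shapes");
        }
        int n = a.length;     // rows of A
        int k = b.length;     // columns of A, rows of B
        int m = b[0].length;  // columns of B

        double[][] c = new double[n][m];
        // i-p-j loop order walks rows of B and C sequentially, which is
        // friendlier to the cache than the textbook i-j-p order.
        for (int i = 0; i < n; i++) {
            for (int p = 0; p < k; p++) {
                double aip = a[i][p];
                for (int j = 0; j < m; j++) {
                    c[i][j] += aip * b[p][j];
                }
            }
        }
        return c;
    }
}
```

Calling NaiveMatMul.multiply on two n-by-n arrays performs n^3
multiply-adds. That is the same arithmetic a BLAS dgemm call does; the gap
described above comes from how the work is scheduled onto the hardware, and
the sketch makes no claim about the size of that gap.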

Still, it's a reasonably common problem, and we shouldn't murder the good in
favor of the best. I'm certain an MM/LinAlg Hadoop library with even
mediocre performance, relative to C, would get used.

Mike Davis
