hadoop-common-user mailing list archives

From Martin Jaggi <m.ja...@gmail.com>
Subject Realtime Map Reduce = Supercomputing for the Masses?
Date Sun, 01 Jun 2008 02:51:40 GMT
Concerning real-time Map Reduce within (and not only between) machines  
(multi-core & GPU), e.g. the Phoenix and Mars frameworks:

I'm really interested in very fast Map Reduce tasks, i.e. with little
disk access. With the rise of multi-core systems this could become
more and more interesting, and could maybe even lead to something like
'super-computing for everyone', or is that a bit far-fetched? Anyway,
I was pleasantly surprised to see the recent Phoenix implementation of
Map Reduce for multi-core CPUs
(http://csl.stanford.edu/~christos/sw/phoenix/); it won the best paper
award at HPCA'07.
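
To make concrete what I mean by Map Reduce within one machine, here is
a toy sketch in plain Java: map tasks run on worker threads, the
intermediate pairs stay in RAM, and the reduce is just an in-memory
merge. (This is only my own illustration, not the actual Phoenix API;
Phoenix itself is a C runtime.)

import java.util.*;
import java.util.concurrent.*;

public class InMemoryWordCount {
    public static void main(String[] args) throws Exception {
        List<String> chunks = Arrays.asList(
            "the quick brown fox", "jumps over the lazy dog", "the end");

        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        // Map phase: each chunk is turned into (word, 1) counts on a
        // worker thread; nothing is spilled to disk.
        List<Future<Map<String, Integer>>> partials = new ArrayList<>();
        for (String chunk : chunks) {
            partials.add(pool.submit(() -> {
                Map<String, Integer> counts = new HashMap<>();
                for (String word : chunk.split("\\s+"))
                    counts.merge(word, 1, Integer::sum);
                return counts;
            }));
        }

        // Reduce phase: merge the in-memory partial results.
        Map<String, Integer> result = new HashMap<>();
        for (Future<Map<String, Integer>> f : partials)
            f.get().forEach((w, c) -> result.merge(w, c, Integer::sum));

        pool.shutdown();
        System.out.println(result);
    }
}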

GPU computing has also been in the news again recently, pushed by
Nvidia (see CUDA: http://www.nvidia.com/object/cuda_showcase.html),
and there too a Map Reduce implementation, called Mars, has become
available:
http://www.cse.ust.hk/gpuqp/Mars_tr.pdf
The Mars people say at the end of their paper: "We are also interested
in integrating Mars into the existing Map Reduce implementations such
as Hadoop so that the Map Reduce framework can take the advantage of
the parallelism among different machines as well as the parallelism
within each machine."
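
(For the 'parallelism within each machine' part at the map stage,
maybe something like the following already gets partway there? I'm
assuming the MultithreadedMapRunner from the mapred lib package and
its thread-count property here; and of course the shuffle still goes
through disk, so it's not quite the real-time case I have in mind.)

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.MultithreadedMapRunner;

public class MultiCoreJobSetup {
    public static void main(String[] args) {
        JobConf conf = new JobConf();
        // Run several map() calls concurrently inside one task JVM,
        // so each machine also exploits its cores.
        conf.setMapRunnerClass(MultithreadedMapRunner.class);
        // Threads per map task (property name as I understand it).
        conf.setInt("mapred.map.multithreadedrunner.threads", 8);
    }
}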

What do you think of this, especially the multi-core approach? Are
these needs already served by Hadoop's current InMemoryFileSystem, or
not? Are there any plans to 'integrate' one of the two frameworks
above? Or would it already be covered by reducing the significant
overhead of the intermediate data pairs
(https://issues.apache.org/jira/browse/HADOOP-3366)?

Any comments?
