hadoop-common-user mailing list archives

From Jason Smith <smith.d.ja...@gmail.com>
Subject Any projects to help with running MapReduce across physically distributed clusters?
Date Wed, 03 Nov 2010 19:18:18 GMT
I am looking into the problem of running jobs to generate statistics across
a large data set that is split among geographically separate clusters.
Each cluster would hold a unique piece of the overall data set, as the
network overhead of collocating the data would be too high. I searched for
tools that might help orchestrate something like this, but did not find
anything. Are there any tools I'm missing that I should look into?

