hadoop-common-user mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: Stretched HDFS cluster
Date Thu, 17 Sep 2009 16:54:47 GMT
Edward Capriolo wrote:

> On a somewhat related topic, I was showing a co-worker a Hadoop setup
> and he asked, "What if we got a bunch of laptops on the
> internet, like the PlayStation 'Folding@home'?" Of course, these are
> widely different distributed models.
> I have been thinking about this. Assume:
> Throw the data security out the window, and assume everyone is playing fair.
> Assume we have systems with a semi-dedicated IP, like my cable
> internet, with no inbound/outbound restrictions.
> Assume every computer is its own RACK
> A LAN has very low latency; assume that latency here is like 40 ms.
> Assume we tune up replication to say 10 or higher to deal with drop on/drop offs
> Could we run a 100-node cluster? If not, what is stopping it from working?
> My next question: for fun, does anyone want to try? We could set up
> IPTABLES/firewall rules allowing Hadoop traffic from IPs in the experiment.
> I have two nodes in Chicago, US ISP to donate. Even if we get 10 nodes
> that would be interesting as a benchmark.

You could see about getting PlanetLab time for something like this,
though because of the way they allocate partially virtualized slices,
you can't be 100% sure that you get the ports you ask for. This will
complicate binding.
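
On the "every computer is its own RACK" assumption in the quoted message: Hadoop's rack awareness can be driven by an external script that receives hostnames or IPs as arguments and prints one rack path per argument. Below is a minimal sketch of such a script, assuming it is wired in through the topology script property (`topology.script.file.name` in releases of this era, as far as I recall). The `/rack-<host>` naming and the `rack_for` helper are made up for illustration and untested on a real cluster.

```python
#!/usr/bin/env python
"""Hypothetical topology script: puts every node in a rack of its own."""
import sys


def rack_for(host):
    """Return a rack path unique to this host (illustrative naming only)."""
    # Strip whitespace and replace ':' so an address with a port or an
    # IPv6 literal still yields a single clean path component.
    safe = host.strip().replace(":", "_")
    return "/rack-" + safe


def main(argv):
    # Hadoop passes one or more hostnames/IPs as arguments and reads
    # one rack path per argument from standard output.
    for host in argv:
        print(rack_for(host))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Run by hand as `python topology.py 10.0.0.1 10.0.0.2`, it prints a distinct rack path for each address, which is what makes the replica placement policy treat every node as a separate rack.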
