hadoop-common-user mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: Namenode Scalability
Date Wed, 17 Aug 2011 10:46:10 GMT
On 17/08/11 08:48, Dieter Plaetinck wrote:
> Hi,
> On Wed, 10 Aug 2011 13:26:18 -0500
> Michel Segel<michael_segel@hotmail.com>  wrote:
>> This sounds more like a homework assignment than a real world problem.
> Why? just wondering.

The question proposed a data rate comparable with Yahoo, Google and
Facebook, yet it was ingress rather than egress, which was even more
unusual. You'd have to be running a web-scale search engine to need that
data rate, and if you were doing that you'd need to know a lot more about
how Hadoop works (i.e. the limited role of the NN). You'd also have to
have addressed the entire network infrastructure, the costs of the work on
your external system, DNS load and the power budget. Oh, and the fact that
unless you were processing and discarding those PB/day at the rate of
ingress, you'd need to add new Hadoop clusters at a rate of 1
cluster/month. That is not only expensive; I don't think datacentre
construction rates could handle it, even if your server vendor had set
up a construction/test pipeline to ship an assembled and tested
containerised cluster every few weeks (which we can do, incidentally :)

>> I guess people don't race cars against trains or have two trains
>> traveling in different directions anymore... :-)
> huh?

Different homework questions.
