From Allen Wittenauer ...@apache.org>
Subject Re: Dedicated disk for operating system
Date Wed, 10 Aug 2011 19:04:50 GMT

On Aug 10, 2011, at 7:56 AM, Evert Lammerts wrote:

> A short, slightly off-topic question:
> 
>>      Also note that in this configuration that one cannot take
>> advantage of the "keep the machine up at all costs" features in newer
>> Hadoop's, which require that root, swap, and the log area be mirrored
>> to be truly effective.  I'm not quite convinced that those features are
>> worth it yet for anything smaller than maybe a 12 disk config.
> 
> Dell and Cloudera promote the C2100. I'd like to see the calculations behind that config.

	If Dell is shipping the same box they shipped us to test a few months ago, the performance
was pretty horrid vs. almost all their competitors.  The main problem was the controller--it
was built for RAID, not for JBOD.  (... and then there is the OOB support...)


> Am I wrong thinking that keeping your cluster up with such dense nodes will only work
> if you have many (order of magnitude 100+) of them, and interconnected with 10Gb Ethernet?
> If you don't then recovery times from failing disks / rack switches are going to get crazy,
> right?

	If one assumes that a bunch of nodes are failing at once, yes.  The irony is that ops teams
tend to group repairs, so keeping them up might actually be the wrong thing in relation to
actual practice.
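
	For a rough feel of the recovery-time question, here is a back-of-envelope sketch in Python.
It is not how HDFS actually schedules re-replication.  The function name, the even spread of
copies across surviving nodes, and the 50% usable-NIC fraction are all assumptions made for
illustration, and the 24 TB node size is a made-up example figure.

```python
def rereplication_hours(node_tb, surviving_nodes, nic_gbit, usable_fraction=0.5):
    """Estimate hours to re-copy the data held by one failed node.

    Assumptions (illustrative only): the lost blocks are re-copied evenly
    across the surviving nodes, and each survivor can devote
    `usable_fraction` of its NIC bandwidth to recovery traffic.
    """
    if surviving_nodes < 1:
        raise ValueError("need at least one surviving node")
    data_bits = node_tb * 1e12 * 8                      # decimal TB -> bits
    aggregate_bps = surviving_nodes * nic_gbit * 1e9 * usable_fraction
    return data_bits / aggregate_bps / 3600             # seconds -> hours


# Hypothetical dense node (24 TB) lost from a small 1Gb cluster
small = rereplication_hours(node_tb=24, surviving_nodes=10, nic_gbit=1)

# Same node lost from a large 10Gb cluster
large = rereplication_hours(node_tb=24, surviving_nodes=100, nic_gbit=10)

print(f"10 nodes @ 1Gb:   {small:.1f} h")
print(f"100 nodes @ 10Gb: {large:.2f} h")
```

	Under these assumptions the small cluster needs on the order of ten hours and the large one
a few minutes, which is the gap the question is getting at.  Real numbers depend on how full
the disks are and how much bandwidth the running jobs leave over.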

> If you want to get bang for buck, don't the proportions "disk IO / processing power",
> "node storage capacity / ethernet speed" and "total amount of nodes / ethernet speed", indicate
> many small nodes with not too many disks and 1Gb Ethernet?

	The biggest constraint is almost always RAM, as you can use it to help with the rest.

