hadoop-common-user mailing list archives

From Colin Evans <co...@metaweb.com>
Subject Re: Question on DFS block placement and 'what is a rack' wrt DFS block placement
Date Tue, 12 Feb 2008 20:19:25 GMT
Because of acquiring servers of different capacities at different times, 
we have 2 servers with 1TB of disk each, and 11 servers with ~300GB 
each.  The 1TB servers tend to be under-utilized by HDFS given their 
capacity.  This makes sense, as block replicas need to be relatively 
evenly distributed across the cluster in order to allow tasks to be run 
close to data.  For our next cluster, we're going with uniform disk, 
CPU, and memory configurations. 
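
As a rough illustration of the under-utilization described above, here is a back-of-the-envelope sketch. It assumes replicas are spread strictly evenly across nodes, which simplifies what HDFS actually does, and it treats the "~300GB" nodes as exactly 300 GB. The function and variable names are hypothetical.

```python
def even_fill(capacities_gb):
    """Estimate usage when every node receives the same amount of block data.

    Assumption: the cluster is effectively full once the smallest node is full.
    """
    per_node = min(capacities_gb)
    used = per_node * len(capacities_gb)
    raw = sum(capacities_gb)
    return {
        "used_gb": used,
        "raw_gb": raw,
        "cluster_utilization": used / raw,
        "per_node_utilization": [per_node / c for c in capacities_gb],
    }


# Cluster from this message: 2 nodes with 1 TB, 11 nodes with ~300 GB.
cluster = [1000] * 2 + [300] * 11
result = even_fill(cluster)
print(result["used_gb"], result["raw_gb"])
print(round(result["cluster_utilization"], 2))
print(result["per_node_utilization"][0])
```

Under these assumptions the 1TB nodes are only about 30% full when the small nodes run out of space, and roughly 1.4 TB of raw capacity goes unused.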

The big question for me is how well a dual-CPU 4-core (8 cores per box) 
configuration will do.  Has anyone tried out this configuration with 
Intel or AMD CPUs?  Is the memory throughput sufficient?

Jason Venner wrote:
> We are starting to build larger clusters, and want to better 
> understand how to configure the network topology.
> Up to now we have just been setting up a private vlan for the small 
> clusters.
> We have been thinking about the following machine configurations
> Compute nodes with a number of spindles and medium disk, that also 
> serve DFS
> For every 4-8 of the above, one compute node with a large number of 
> spindles with a large number of disks, to bulk out the DFS capacity.
> We are wondering what the best practices are for network topology in 
> clusters that are built out of the above building blocks.
> We can readily have 2 or 4 network cards in each node.
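
On the "what is a rack" part of the question: Hadoop releases with rack awareness can call an admin-supplied script that maps node addresses to rack paths. The property that points at the script is net.topology.script.file.name in current releases, and its name has changed across versions, so check the one you run. The script is given one or more addresses as arguments and prints one rack path per address. Below is a minimal sketch of such a script. The subnet-to-rack table is purely hypothetical and would need to match your own network layout.

```python
#!/usr/bin/env python3
import ipaddress
import sys

# Hypothetical subnet-to-rack table; replace with your own layout.
RACKS = {
    "10.1.1.0/24": "/rack1",
    "10.1.2.0/24": "/rack2",
}
DEFAULT_RACK = "/default-rack"


def rack_for(address):
    """Return the rack path for an IP address, or the default rack."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Not a parseable IP (e.g. a hostname): fall back to the default.
        return DEFAULT_RACK
    for subnet, rack in RACKS.items():
        if ip in ipaddress.ip_network(subnet):
            return rack
    return DEFAULT_RACK


if __name__ == "__main__":
    # One rack path per argument, in the same order as the arguments.
    print(" ".join(rack_for(arg) for arg in sys.argv[1:]))
```

With a mapping like this, each switch (or vlan) can be treated as one rack, so that replica placement can take the network layout into account.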
