hadoop-common-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: Hardware Selection for Hadoop
Date Tue, 07 May 2013 12:45:24 GMT
I wouldn't go the route of multiple NICs unless you are using MapR. 
MapR allows you to do port bonding, or rather use both ports simultaneously. 
When you port bond, 1+1 != 2, and then you have some other configuration issues. 
(Unless they've fixed them.)

If this is your first cluster... keep it simple.  If your machine comes with 2 NIC ports, use one, and then once you're an 'expert', turn on the second port. 

HTH

-Mike

On May 5, 2013, at 11:05 PM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:

> Multiple NICs provide 2 benefits: 1) high availability, and 2) increased network bandwidth when using an LACP-type model.
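> For reference, LACP-style bonding on a Linux node is usually configured along these lines (an illustrative RHEL-style ifcfg sketch only; the interface names and the matching switch-side LACP setup are assumptions, not from this thread):
> 
>     # /etc/sysconfig/network-scripts/ifcfg-bond0  (illustrative)
>     DEVICE=bond0
>     TYPE=Bond
>     BONDING_MASTER=yes
>     BOOTPROTO=none
>     ONBOOT=yes
>     BONDING_OPTS="mode=802.3ad miimon=100"   # 802.3ad = LACP; switch ports must be configured to match
> 
>     # /etc/sysconfig/network-scripts/ifcfg-eth0  (repeat for eth1)
>     DEVICE=eth0
>     MASTER=bond0
>     SLAVE=yes
>     BOOTPROTO=none
>     ONBOOT=yes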
> 
> On Sun, May 5, 2013 at 8:41 PM, Rahul Bhattacharjee <rahul.rec.dgp@gmail.com> wrote:
> OK. I do not know if I understand the spindle / core thing. I will dig more into that.
> 
> Thanks for the info. 
> 
> One more thing, what's the significance of multiple NICs?
> 
> Thanks,
> Rahul
> 
> 
> On Mon, May 6, 2013 at 12:17 AM, Ted Dunning <tdunning@maprtech.com> wrote:
> 
> Data nodes normally are also task nodes.  With 8 physical cores it isn't that unreasonable to have 64GB, whereas 24GB really is going to pinch.
> 
> Achieving highest performance requires that you match the capabilities of your nodes, including CPU, memory, disk and networking.  The standard wisdom is 4-6GB of RAM per core, at least a spindle per core, and 1/2 to 2/3 of disk bandwidth available as network bandwidth.
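> As a rough back-of-the-envelope check of that arithmetic, something like the following minimal Python sketch works (the ~100 MB/s per SATA spindle and per 1GbE port figures are assumptions for illustration, not measurements):
> 
>     # Rule-of-thumb node sizing check; constants are illustrative assumptions.
>     DISK_MB_S = 100        # assumed sustained throughput per SATA spindle
>     NIC_1GBE_MB_S = 100    # assumed usable throughput per 1GbE port (10GbE ~ 1000)
> 
>     def check_node(cores, ram_gb, spindles, nic_ports, nic_mb_s=NIC_1GBE_MB_S):
>         disk_bw = spindles * DISK_MB_S
>         net_bw = nic_ports * nic_mb_s
>         print("RAM per core: %.1f GB (rule of thumb: 4-6)" % (float(ram_gb) / cores))
>         print("Spindles per core: %.2f (rule of thumb: >= 1)" % (float(spindles) / cores))
>         print("Disk %d MB/s vs network %d MB/s (want network >= %d MB/s, i.e. 2/3 of disk)"
>               % (disk_bw, net_bw, disk_bw * 2 // 3))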
> 
> If you look at the different configurations mentioned in this thread, you will see different limitations.
> 
> For instance:
> 
> 2 x Quad cores Intel
> 2-3 TB x 6 SATA         <==== 6 disks < desired 8 or more
> 64GB mem                <==== slightly larger than necessary
> 2 1GbE NICs teaming     <==== 2 x 100 MB/s << 400 MB/s = 2/3 x 6 x 100 MB/s
> 
> This configuration is mostly limited by networking bandwidth.
> 
> 2 x Quad cores Intel
> 2-3 TB x 6 SATA         <==== 6 disks < desired 8 or more
> 24GB mem                <==== 24GB << 8 x 6GB
> 2 10GbE NICs teaming    <==== 2 x 1000 MB/s > 400 MB/s = 2/3 x 6 x 100 MB/s
>  
> This configuration is weak on disk relative to CPU and very weak on disk relative to network speed.  The worst problem, however, is likely to be small memory.  This will likely require us to decrease the number of slots by half or more, making it impossible to even use the 6 disks that we have and making the network even more outrageously over-provisioned.
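> Plugging the two configurations above into that sketch (hypothetical calls, same assumed constants) reproduces the same conclusions:
> 
>     check_node(cores=8, ram_gb=64, spindles=6, nic_ports=2, nic_mb_s=100)
>     # -> 8 GB/core, 0.75 spindles/core, disk 600 MB/s vs network 200 MB/s: network-limited
>     check_node(cores=8, ram_gb=24, spindles=6, nic_ports=2, nic_mb_s=1000)
>     # -> 3 GB/core, disk 600 MB/s vs network 2000 MB/s: memory is the pinch, network over-provisioned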
>  
> 
> 
> 
> On Sun, May 5, 2013 at 9:41 AM, Rahul Bhattacharjee <rahul.rec.dgp@gmail.com> wrote:
> IMHO, 64 GB looks a bit high for a DN. 24 GB should be good enough for a DN.
> 
> 
> On Tue, Apr 30, 2013 at 12:19 AM, Patai Sangbutsarakum <Patai.Sangbutsarakum@turn.com> wrote:
> 2 x Quad cores Intel
> 2-3 TB x 6 SATA
> 64GB mem
> 2 NICs teaming
> 
> my 2 cents
> 
> 
> On Apr 29, 2013, at 9:24 AM, Raj Hadoop <hadoopraj@yahoo.com> wrote:
> 
>> Hi,
>>  
>> I have to propose some hardware requirements in my company for a Proof of Concept with Hadoop. I was reading Hadoop Operations and also saw the Cloudera website. But I just wanted to know from the group - what are the requirements if I have to plan for a 5-node cluster? I don't know at this time the amount of data that needs to be processed for the Proof of Concept. So - can you suggest something to me?
>>  
>> Regards,
>> Raj
> 
> 
> 
> 
> 

