hbase-user mailing list archives

From "Bartosz M. Frak" <bar...@bnl.gov>
Subject Re: DataNode Hardware
Date Thu, 12 Jul 2012 20:20:58 GMT
Amandeep Khurana wrote:
> Inline.
>
> On Thursday, July 12, 2012 at 12:56 PM, Bartosz M. Frak wrote:
>
>> Quick question about data node hardware. I've read a few articles, which 
>> cover the basics, including Cloudera's recommendations here:
>> http://www.cloudera.com/blog/2010/03/clouderas-support-team-shares-some-basic-hardware-recommendations/
>>
>> The article is from early 2010, but I'm assuming that the general 
>> guidelines haven't deviated much from the recommended baselines. I'm 
>> skewing my build towards the "compute optimized" side of the spectrum, 
>> which calls for a 1:1 core-to-spindle ratio and more RAM per node 
>> for in-memory caching.
>>
>
> Why are you skewing more towards compute optimized? Are you expecting to run
> compute-intensive MR interacting with HBase tables?
>
Correct. We'll be storing dense raw numerical time-based data, which will 
need to be transformed (decimated, FFTed, correlated, etc.) with 
relatively low latency (under 10 seconds). We also expect repeatable 
reads, where the same piece of data is "looked" at more than once in a 
short amount of time. This is where we are hoping that in-memory caching 
and data node affinity can help us.
>> Another important consideration is low(ish) power 
>> consumption. With that in mind I had specced out the following (per node):
>>
>> Chassis: 1U Supermicro chassis with 2x 1Gb/sec ethernet ports 
>> (http://www.supermicro.com/products/system/1u/5017/sys-5017c-mtf.cfm) 
>> (~500USD)
>> Memory: 32GB Unbuffered ECC RAM (~280USD)
>> Disks: 4x 2TB Hitachi Ultrastar 7200RPM SAS Drives (~960USD)
>>
>
> You can use plain SATA. Don't need SAS.
>
This is a government sponsored project, so some requirements (like MTBF 
and spindle warranty) are "set in stone", but I'll look into that.
>> CPU: 1x Intel E3-1230-v2 (3.3Ghz 4 Core / 8 Thread 69W) (~240USD)
>>
>
> Consider getting dual hex core CPUs.
>
I'm trying to avoid that for two reasons. Dual socket boards are (1) 
more expensive and (2) power hungry. Additionally, the CPUs for those 
boards are also more expensive and less efficient than their single-socket 
counterparts (take a look at Intel's E3 and E5 line pricing). The 
guidelines from the quoted article state:

"Compute Intensive Configuration (2U/machine): Two quad core CPUs, 
48-72GB memory, and 8 disk drives (1TB or 2TB). These are often used 
when a combination of large in-memory models and heavy reference data 
caching is required."

My two 1U machines, which are equivalent to this recommendation, have 8 
(very fast, low wattage) cores, 64GB RAM and 8x 2TB disks.
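As a quick sanity check, the totals compare as follows; all figures are taken from this thread (using the low end of the quoted 48-72GB memory range):

```python
# Two of the proposed 1U E3-1230-v2 nodes vs. the quoted Cloudera
# 2U "compute intensive" configuration. Figures from the thread.
cloudera_2u   = {"cores": 2 * 4, "ram_gb": 48, "disks": 8}      # two quad-core CPUs
proposed_2x1u = {"cores": 2 * 4, "ram_gb": 2 * 32, "disks": 2 * 4}

for key in sorted(cloudera_2u):
    print(f"{key}: 2U={cloudera_2u[key]} vs 2x1U={proposed_2x1u[key]}")
```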

>> The network interconnect will consist of a dedicated high-powered switch (not sure 
>> which one yet) with each node utilizing link aggregation.
>>
>> Does this look reasonable? We are looking into buying 4-5 of those for 
>> our initial test bench for under $10000 and plan to expand to about 
>> 50-100 nodes by next year.
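For reference, per-node link aggregation of the two onboard 1Gb/s ports might look roughly like this with the Linux bonding driver. This is a sketch under assumptions: the interface names (eth0/eth1) are placeholders, and LACP (802.3ad) mode requires the switch to be configured for it on the matching ports.

```shell
# Hypothetical example: aggregate the two 1Gb/s ports into one
# 802.3ad (LACP) bond; requires matching switch-side configuration.
ip link add bond0 type bond mode 802.3ad miimon 100
ip link set eth0 down && ip link set eth0 master bond0
ip link set eth1 down && ip link set eth1 master bond0
ip link set bond0 up
```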

