hadoop-common-user mailing list archives

From Ted Dunning <tdunn...@veoh.com>
Subject Re: hdfs > 100T?
Date Thu, 10 Apr 2008 20:36:16 GMT

I should mention that the generally available version of MogileFS is not
suitable for large installs.

We had to make significant changes to get it to work correctly.  We are
figuring out how to contribute these back, but may have to fork the project
to do it.


On 4/10/08 12:21 PM, "Todd Troxell" <ttroxell@debian.org> wrote:

> On Thu, Apr 10, 2008 at 09:18:02AM -0700, Ted Dunning wrote:
>> Hadoop also does much better with spindles spread across many machines.
>> Putting 16 TB on each of two nodes is distinctly sub-optimal on many fronts.
>> Much better to put 0.5-2TB on 16-64 machines.  With 2x1TB SATA drives, your
>> cost and performance are likely to both be better than two machines with
>> storage trays.  Aggressive pricing right now on minimal machines with 16TB in
>> two storage trays from a major vendor is about 18K$, while you should be able
>> to populate a 1U node with 2TB of disk for about $1500, so 16 x 1.5K$ = 24K$
>> < 2 x 18K$.  The rack space requirements are about the same, but you may have
>> slightly lower power for the tray solutions.
>> 
>> On the other hand, your performance requirements are so low that you might
>> be just as well off getting something like a Sun Thumper that can accommodate
>> all of your storage in a single chassis.
>> 
>> We use a mixture of both kinds of solution in our system.  We have nearly a
>> billion files stored on tray based machines using mogileFS.  One scaling
>> constraint there is simply the management and configuration of nodes, so
>> fewer machines is a small win.  We also have a modest number of TBs in a
>> more traditional hadoop cluster with small machines.
> 
> Thanks for the input.  I had been considering mogilefs as well until
> recently but I don't have enough machines to do any serious benchmarking of
> configurations.  This post has been very helpful.
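
The cost comparison in the quoted message can be checked with a few lines of
Python.  This is only a sketch of the arithmetic: the prices and capacities are
the 2008 figures quoted above, and the names (summarize, SMALL_NODE_COST, and
so on) are made up for illustration.

```python
# Figures taken from the quoted message (2008 prices).
# All names below are illustrative, not from any real tool.
SMALL_NODE_COST = 1_500    # dollars per 1U node populated with 2 TB of disk
SMALL_NODE_TB = 2
SMALL_NODE_COUNT = 16

TRAY_NODE_COST = 18_000    # dollars per machine with 16 TB in two storage trays
TRAY_NODE_TB = 16
TRAY_NODE_COUNT = 2


def summarize(count, cost_each, tb_each):
    """Return (total cost in dollars, total raw TB, dollars per raw TB)."""
    total_cost = count * cost_each
    total_tb = count * tb_each
    return total_cost, total_tb, total_cost / total_tb


small = summarize(SMALL_NODE_COUNT, SMALL_NODE_COST, SMALL_NODE_TB)
tray = summarize(TRAY_NODE_COUNT, TRAY_NODE_COST, TRAY_NODE_TB)

for label, (cost, tb, per_tb) in (("16 small nodes", small), ("2 tray nodes", tray)):
    print(f"{label}: ${cost:,} for {tb} TB raw (${per_tb:,.0f}/TB)")
```

Both options come to 32 TB of raw disk, so the comparison is like for like on
capacity.  It ignores power, rack space, and admin overhead, which the message
notes can favor fewer machines.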

