hadoop-hdfs-user mailing list archives

From Rita <rmorgan...@gmail.com>
Subject Re: Sizing help
Date Tue, 08 Nov 2011 00:34:32 GMT
I have been running with 2x replication on a 500 TB cluster. No issues
whatsoever. 3x is for the super paranoid.


On Mon, Nov 7, 2011 at 5:06 PM, Ted Dunning <tdunning@maprtech.com> wrote:

> Depending on which distribution and what your data center power limits are,
> you may save a lot of money by going with machines that have 12 x 2 TB or 3 TB
> drives.  With suitable engineering margins and 3x replication you can have
> 5 TB net data per node and 20 nodes per rack.  If you want to go all cowboy
> with 2x replication and little space to spare, then you can double that
> density.
>
> On Monday, November 7, 2011, Rita <rmorgan466@gmail.com> wrote:
> > For a 1 PB installation you would need close to 170 servers with a 12 TB
> disk pack installed on each (with a replication factor of 2). That's a
> conservative estimate.
> > CPUs: 4 cores with 16 GB of memory.
> >
> > Namenode: 4 cores with 32 GB of memory should be OK.
> >
> >
> > On Fri, Oct 21, 2011 at 5:40 PM, Steve Ed <sedison70@gmail.com> wrote:
> >>
> >> I am a newbie to Hadoop and trying to understand how to size a Hadoop
> cluster.
> >>
> >> What factors should I consider when deciding the number of datanodes?
> >>
> >> Datanode configuration?  CPU, memory?
> >>
> >> How much memory is required for the namenode?
> >>
> >> My client is looking at 1 PB of usable data and will be running
> analytics on TB-size files using mapreduce.
> >>
> >>
> >>
> >>
> >>
> >> Thanks
> >>
> >> ….. Steve
> >>
> >>
> >
> >
> > --
> > --- Get your facts first, then you can distort them as you please.--
> >
>



-- 
--- Get your facts first, then you can distort them as you please.--
