YIMEN YIMGA Gael
Subject RE: Need to evaluate a cluster
Date Thu, 10 Jul 2014 08:35:01 GMT
```Hi,

To Mirko
The number of HDDs per datanode is 3 (3 disks of 1TB to 3TB each).

I calculated the number of nodes using the following formulas:

=======

-       Used space on the cluster by daily feed : <daily feed> * <replication factor>
= 720GB * 3 = 2160GB

-       Size of a disk for HDFS : <size of a disk> * (1 - <booked space for each
disk outside HDFS>) = 3TB * (1 - 30%) = 2.1TB

-       Number of datanodes in a year (assuming no monthly data growth) : <used space on
the cluster by daily feed> * 365 / <size of a disk for HDFS> = 2160GB * 365 / (1024 * 2.1TB)
= 367 datanodes
=======
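
Worked check of the figures above (a sketch only; it assumes every disk is 3TB, although the disks are stated as 1TB to 3TB):

-       Replicated data per year : 2160GB * 365 = 788400GB = 769.92TB (at 1024GB per TB)

-       Disks needed : 769.92TB / 2.1TB per disk = 366.6, rounded up to 367

-       The divisor is the usable space of a single disk, so 367 counts disks. With 3 disks
per datanode : 769.92TB / (3 * 2.1TB) = 122.2, rounded up to 123 datanodes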

To Olivier

About data compression: no, I assumed the data would not be compressed.
How should I use the compression ratio in my calculation?

Standing by

From: Olivier Renault
Sent: Wednesday 9 July 2014 18:51
Subject: Re: Need to evaluate a cluster

Is your data already compressed? If it's not, you can safely assume a compression ratio of
5.

Olivier
Mirko Kämpf wrote:
Hello,

If I follow your numbers, I see one missing fact: what is the number of HDDs per DataNode?
Let's assume you use machines with 6 x 3TB HDDs per box: you would need about 60 DataNodes
per year (0.75 TB per day x 365 days x 3 for replication x 1.3 for overhead / (number of HDDs
per node x capacity per HDD)).
With 12 HDDs you would only need 30 servers per year.
How did you calculate the number of 367 datanodes?

Cheers,
Mirko

YIMEN YIMGA Gael
Hello Dear,

I estimated the number of nodes for a cluster that will be fed 720GB of data per day.
My estimate came to 367 datanodes for one year. I'm a bit worried by that number of datanodes.
The assumptions I used are the following:

-       Daily supply (feed): 720GB

-       HDFS replication factor: 3

-       Space reserved on each disk outside HDFS: 30%

-       Size of a disk: 3TB

I have two questions.

First, are my assumptions reasonable?
Second, could someone please help me evaluate this cluster, so that I can be sure my results
are not excessive?

Warm regards

