hadoop-user mailing list archives

From Bhagaban Khatai <email.bhaga...@gmail.com>
Subject Re: Cluster sizing
Date Fri, 29 May 2015 07:12:23 GMT
Thanks Ashish for your help.

We don't have a clear picture yet. We are approaching a few clients on this,
and many of them are asking us for a cluster-size configuration.

One customer gave us a few requirements: they will process 100TB of data per
day, and we need to come up with node details and how much memory and how many
cores are required. Note: the 100TB figure is without replication.
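As a starting point, that 100TB/day figure can be turned into a rough node
estimate. Below is a minimal back-of-the-envelope sketch; the retention window
(30 days), ~25% headroom for temp/shuffle data, and 48TB of usable disk per
node are all assumptions for illustration, not vendor recommendations. Only
the replication factor of 3 is the HDFS default.

```python
# Rough Hadoop cluster storage sizing sketch.
# All parameter defaults below are illustrative assumptions.

def size_cluster(daily_ingest_tb,
                 retention_days=30,      # assumed retention window
                 replication=3,          # HDFS default replication factor
                 compression_ratio=1.0,  # 1.0 = no compression assumed
                 overhead=1.25,          # ~25% headroom for temp/shuffle data
                 disk_per_node_tb=48):   # e.g. 12 x 4 TB drives per node
    """Estimate raw storage (TB) and data-node count for a daily ingest volume."""
    raw_tb = (daily_ingest_tb * retention_days * replication
              * compression_ratio * overhead)
    nodes = -(-raw_tb // disk_per_node_tb)  # ceiling division
    return raw_tb, int(nodes)

raw, nodes = size_cluster(100)
print(f"Raw storage: {raw:.0f} TB across ~{nodes} data nodes")
# -> Raw storage: 11250 TB across ~235 data nodes
```

Memory and core counts would then follow from the job mix (how many
containers per node the workload needs), which is exactly why the questions
below about the use case matter.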

We may be going with Cloudera.

Please suggest.

On Fri, May 29, 2015 at 11:45 AM, Ashish Kumar9 <ashishk4@in.ibm.com> wrote:

> Can you share some more inputs on the requirement?
>
> What is the analytics use case? (Batch processing, real-time, in-memory
> requirements)
> Which distribution of Hadoop ?
> What is the storage growth rate ?
> What are the data ingest requirements ?
> What kind of jobs will run on the cluster ?
> What is the nature of the data? Is data compression applicable?
> What are the HA requirements? What are the performance expectations?
>
> Based on these requirements, you would have to design the compute, storage,
> and network elements.
>
> Thanks and Regards,
> Ashish Kumar
> IBM Systems BigData Analytics Solutions Architect
>
>
> From:        Bhagaban Khatai <email.bhagaban@gmail.com>
> To:        user@hadoop.apache.org
> Date:        05/29/2015 11:32 AM
> Subject:        Cluster sizing
> ------------------------------
>
>
>
> Hi,
>
> I wanted to know how I can determine how many nodes (with cores, storage in
> TB, and RAM) are needed if the data volume I receive increases from 1TB to
> 100TB per day. Can someone help me create an Excel sheet based on this?
>
> Thanks
>
