hadoop-user mailing list archives

From Cosmin Lehene <cleh...@adobe.com>
Subject Re: One petabyte of data loading into HDFS with in 10 min.
Date Wed, 05 Sep 2012 14:43:14 GMT
Here's an extremely naïve ballpark estimate at theoretical hardware speed, for 3PB of raw data
(1PB with 3x replication):

Over a single 1Gbps connection (and I'm not sure you can actually reach 1Gbps):
(3 petabytes) / (1 Gbps) = 291.271111 days

So you'd need on the order of 40,000 1Gbps network cards to get that done in 10 minutes :)
(3PB/1Gbps)/40000 <http://www.google.ro/search?client=safari&rls=en&q=(3PB/1Gbps)/40000&ie=UTF-8&oe=UTF-8&redir_esc=&ei=2WRHUNWtGIWo0QW52oDYDw>
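
The arithmetic above can be reproduced with a short script. This is only a sketch under the same idealized assumptions as the estimate (every link runs at full line rate, no protocol, disk, or NameNode overhead); the function names are made up for illustration. The 291.27 days figure appears to come from reading PB and Gbps as binary units (2^50 bytes and 2^30 bits per second), so both the binary and the decimal readings are shown.

```python
SECONDS_PER_DAY = 86_400

def transfer_seconds(data_bytes, link_bits_per_second, links=1):
    """Ideal time to push data_bytes over `links` parallel links, ignoring all overhead."""
    return data_bytes * 8 / (link_bits_per_second * links)

def links_needed(data_bytes, link_bits_per_second, deadline_seconds):
    """Smallest whole number of links that meets the deadline (integer ceiling division)."""
    total_bits = data_bytes * 8
    capacity_per_link = link_bits_per_second * deadline_seconds
    return -(-total_bits // capacity_per_link)

# Assumption: 1PB of user data stored with 3x replication means 3PB on the wire.
REPLICATION = 3
DEADLINE = 10 * 60  # 10 minutes, in seconds

# Two readings of the units: binary (2^50 bytes, 2^30 bit/s) and decimal (10^15, 10^9).
units = {
    "binary": (2**50, 2**30),
    "decimal": (10**15, 10**9),
}

for name, (petabyte, gbps) in units.items():
    raw_bytes = REPLICATION * petabyte
    days = transfer_seconds(raw_bytes, gbps) / SECONDS_PER_DAY
    cards = links_needed(raw_bytes, gbps, DEADLINE)
    print(f"{name}: {days:.2f} days over one link, {cards} links for 10 minutes")
```

With binary units a single link takes about 291.27 days and roughly 41,944 links are needed; with decimal units it is about 277.78 days and exactly 40,000 links. Either way the answer lands around 40,000 fully saturated 1Gbps links.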

The actual number of nodes would depend a lot on the network architecture, the type
of storage you use (SSD, HDD), etc.

Cosmin
From: prabhu K <prabhu.hadoop@gmail.com>
Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Date: Wednesday, September 5, 2012 3:21 PM
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: One petabyte of data loading into HDFS with in 10 min.

Hi Users,

Please clarify the questions below.

1. To load one petabyte of data into HDFS/Hive within 10 minutes, how many slave (DataNode)
machines are required?

2. To load one petabyte of data into HDFS/Hive within 10 minutes, what configuration
setup is needed for cloud computing?

Please advise and help me with this.

Thanks & Regards,
Prabhu.