hadoop-common-user mailing list archives

From Shashidhar Rao <raoshashidhar...@gmail.com>
Subject Time taken to do a word count on 10 TB data.
Date Mon, 14 Apr 2014 17:57:01 GMT
Hi,

Can somebody give me a rough estimate, in hours or minutes, of how long a
cluster of, say, 30 nodes would take to run a MapReduce job that performs a
word count on, say, 10 TB of data, assuming that the hardware and the
MapReduce program are both optimally tuned?

I only need a rough estimate; the data size could be 5 TB, 10 TB, or 20 TB.
If not a word count, the job could be any analysis over data of that size.
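
To make the question concrete, here is a minimal sketch of the kind of
back-of-envelope calculation I have in mind. It assumes the job is bound
purely by how fast the cluster can scan the input, and it ignores shuffle,
sort, reduce, and job startup overhead, so at best it gives a lower bound.
The function name `estimate_hours` and the 100 MB/s per-node throughput are
placeholders I made up for illustration, not measured values.

```python
def estimate_hours(data_tb: float, nodes: int, mb_per_sec_per_node: float) -> float:
    """Lower-bound runtime in hours for a scan-dominated job.

    Assumes perfect parallelism across nodes and no overhead beyond
    reading the input once.
    """
    if nodes <= 0 or mb_per_sec_per_node <= 0:
        raise ValueError("nodes and mb_per_sec_per_node must be positive")
    total_mb = data_tb * 1_000_000  # decimal units: 1 TB = 10^6 MB
    cluster_mb_per_sec = nodes * mb_per_sec_per_node
    return total_mb / cluster_mb_per_sec / 3600  # seconds to hours


if __name__ == "__main__":
    # 100 MB/s per node is a hypothetical figure, not a benchmark result.
    for size_tb in (5, 10, 20):
        hours = estimate_hours(size_tb, nodes=30, mb_per_sec_per_node=100.0)
        print(f"{size_tb} TB: about {hours:.2f} h")
```

Under this model the estimate scales linearly with data size and inversely
with node count, so the real unknown is the sustained per-node throughput.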

Regards
Shashidhar
