hadoop-hdfs-user mailing list archives

From Shashidhar Rao <raoshashidhar...@gmail.com>
Subject Time taken to do a word count on 10 TB data.
Date Mon, 14 Apr 2014 17:57:01 GMT

Can somebody give me a rough estimate, in hours or minutes, of how long a
cluster of, say, 30 nodes would take to run a MapReduce word count job on
about 10 TB of data, assuming the hardware and the MapReduce program are
both optimally tuned?

I only need a rough estimate; the data size could be 5 TB, 10 TB, or 20 TB.
If not a word count, the job could be any analysis over data of that size.

