hadoop-common-user mailing list archives

From: Kunsheng Chen <ke...@yahoo.com>
Subject: Scaling inference on Hadoop DFS
Date: Mon, 30 Nov 2009 21:16:26 GMT
Hi everyone,

Currently I have a MapReduce program that sorts input records and map-reduces them into
output records, with priority information attached to each. So far the program is running
on 1 master node and 3 datanodes.
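
In outline, the job looks something like the sketch below (simplified; PrioritySort,
PriorityMapper and computePriority are placeholder names, and the real priority logic
is more involved than the stub shown here):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class PrioritySort {

  public static class PriorityMapper
      extends Mapper<LongWritable, Text, IntWritable, Text> {
    // Placeholder: the real program derives the priority from record content.
    private int computePriority(String record) {
      return record.length() % 10;
    }
    public void map(LongWritable offset, Text record, Context context)
        throws IOException, InterruptedException {
      // Emit (priority, record); the shuffle sorts records by priority key.
      context.write(new IntWritable(computePriority(record.toString())), record);
    }
  }

  public static class PriorityReducer
      extends Reducer<IntWritable, Text, IntWritable, Text> {
    public void reduce(IntWritable priority, Iterable<Text> records, Context context)
        throws IOException, InterruptedException {
      // Records arrive grouped and sorted by priority key; write each
      // record out together with its priority.
      for (Text record : records) {
        context.write(priority, record);
      }
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration(), "priority sort");
    job.setJarByClass(PrioritySort.class);
    job.setMapperClass(PriorityMapper.class);
    job.setReducerClass(PriorityReducer.class);
    job.setOutputKeyClass(IntWritable.class);
    job.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}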

The data I have collected looks something like the following:

number of records:    1,000,000
time to process:      100 seconds
input size:           20 MB
number of datanodes:  3

I am wondering whether I could make an assumption like: given 2,000,000 records, the
program would finish in 200 seconds?
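
In other words, I am assuming linear scaling: the measured throughput is
1,000,000 records / 100 seconds = 10,000 records per second, so 2,000,000 records
would take 2,000,000 / 10,000 = 200 seconds, as long as per-job overhead (job setup,
shuffle, reduce start-up) stays roughly constant.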

Any feedback on the feasibility of this kind of scalability would be helpful, as it is
important to the analysis in my master's thesis.

Any ideas are much appreciated!
