Hi, NguyenHuynh
If you need a precise answer, there is an article that gives the following
formulas for computing the number of maps/reduces:
Number of maps : max(min(block_size, data/#maps), min_split_size)
Number of reduces : 0.95*num_nodes*mapred.tasktracker.reduce.tasks.maximum
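The reduce formula is easy to try out. Here is a minimal sketch of it; the cluster values (3 nodes, 2 reduce slots per node, i.e. mapred.tasktracker.reduce.tasks.maximum = 2) are illustrative assumptions, not taken from any real cluster:

```java
public class ReduceCount {
    // 0.95 * num_nodes * mapred.tasktracker.reduce.tasks.maximum,
    // rounded down to a whole number of reduce tasks.
    static int suggestedReduces(int numNodes, int reduceSlotsPerNode) {
        return (int) Math.floor(0.95 * numNodes * reduceSlotsPerNode);
    }

    public static void main(String[] args) {
        // e.g. 3 nodes with 2 reduce slots each -> 5 reduces
        System.out.println(suggestedReduces(3, 2));
    }
}
```

The 0.95 factor leaves a little headroom so all reduces can launch in a single wave while tolerating a failed or slow node.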
If you need an "understandable" solution, try the following link:
http://wiki.apache.org/hadoop/HowManyMapsAndReduces
If you need just a rough estimation:
Maps: The right level of parallelism for maps seems to be around
10-100 maps per node.
There is also another heuristic (I find it less useful than the others),
whose source I can't find right now. It goes roughly like this:
Maps: The even number closest to a few times the number of task
trackers.
Reduces: The same as the number of task trackers.
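Whichever heuristic you pick, you apply the counts in your job driver with the old mapred API. This is a config-setting sketch only (the job class name is a placeholder, and it needs the Hadoop jars on the classpath); note that the map count is just a hint to the framework, while the reduce count is honored exactly:

```java
import org.apache.hadoop.mapred.JobConf;

JobConf conf = new JobConf(MyJob.class);
conf.setNumMapTasks(30);    // only a hint; the InputFormat decides the real split count
conf.setNumReduceTasks(5);  // taken exactly as given
```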
Are you using the namenode and jobtracker machines as tasktrackers, too?
If you have difficulties, please let me know.
Rasit
2009/2/16 nguyenhuynh :
> Hi all!,
>
>
> I have 3 machines used to run Hadoop/HBase map-reduce. I don't know what
> values to set for the number of map tasks and reduce tasks.
>
>
> How many map tasks and reduce tasks should I use in this case?
>
>
> Please help me!
>
> Thanks,
>
>
> Regards,
>
> NguyenHuynh
>
>
>
--
M. Raşit ÖZDAŞ