hbase-user mailing list archives

From Rasit OZDAS <rasitoz...@gmail.com>
Subject Re: Number map task and reduce?
Date Mon, 16 Feb 2009 07:05:56 GMT
Hi, NguyenHuynh

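If you need a precise answer, there is an article that gives the following
formulas. Note that the first one yields the input split size; the number of
maps follows from it. Here #maps is the requested number of maps (a hint):
Split size        :   max(min(block_size, data/#maps), min_split_size)
Number of maps    :   roughly data / split size
Number of reduces :   0.95*num_nodes*mapred.tasktracker.reduce.tasks.maximum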
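
To make the arithmetic concrete, here is a minimal Python sketch of those formulas. It is only an illustration, not Hadoop's own code: the function names and the example figures (10 GiB of input, 64 MiB blocks, 2 reduce slots per node) are assumptions, and it treats the input as one large file, ignoring per-file split boundaries.

```python
import math


def split_size(total_bytes, requested_maps, block_size, min_split_size=1):
    """Split size = max(min(block_size, data/#maps), min_split_size)."""
    # data/#maps: the size each split would have if the hint were honored.
    goal_size = total_bytes // max(requested_maps, 1)
    return max(min(block_size, goal_size), min_split_size)


def estimate_maps(total_bytes, requested_maps, block_size, min_split_size=1):
    """Approximate map count: one map task per input split."""
    size = split_size(total_bytes, requested_maps, block_size, min_split_size)
    return math.ceil(total_bytes / size)


def estimate_reduces(num_nodes, reduce_slots_per_node, factor=0.95):
    """0.95 * num_nodes * mapred.tasktracker.reduce.tasks.maximum, rounded down."""
    return int(factor * num_nodes * reduce_slots_per_node)


if __name__ == "__main__":
    GIB = 1024 ** 3
    MIB = 1024 ** 2
    # Assumed example: 10 GiB of input, 64 MiB blocks, a hint of 2 maps.
    print(estimate_maps(10 * GIB, 2, 64 * MIB))
    # Assumed example: 3 nodes with 2 reduce slots each.
    print(estimate_reduces(3, 2))
```

With these assumed figures the block size wins over the hint, so the map count is driven by the amount of data rather than by the number you request.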

If you need an "understandable" solution, try the following link:
http://wiki.apache.org/hadoop/HowManyMapsAndReduces

If you just need a rough estimate:
Maps: The right level of parallelism for maps seems to be around
10-100 maps per node.
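
As a quick illustration of that rule of thumb, the sketch below scales the 10-100 range by the node count. The function name is hypothetical, and the 3-node example assumes all three of your machines run tasktrackers.

```python
def rough_map_range(num_nodes, low_per_node=10, high_per_node=100):
    """Return the (low, high) total map count for 10-100 maps per node."""
    return num_nodes * low_per_node, num_nodes * high_per_node


if __name__ == "__main__":
    low, high = rough_map_range(3)
    print(f"roughly {low} to {high} maps in total")
```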

There is also another rule of thumb (I find it worse than the others) that I
can't locate right now. It goes something like:
Maps   : The even number closest to a small multiple of the number
of task trackers.
Reduces: The same as the number of task trackers.

Are you also using the namenode and jobtracker machines as tasktrackers?

If you run into difficulties, please let us know.
Rasit

2009/2/16 nguyenhuynh <nguyenhuynh@asnet.com.vn>:
> Hi all!,
>
>
> I have 3 machines that I use to run Hadoop/HBase map-reduce. I don't know
> what values to set for the number of map and reduce tasks.
>
>
> How many map and reduce tasks should I use in this case?
>
>
> Please, help me!
>
> Thanks,
>
>
> Regards,
>
> NguyenHuynh
>
>
>



-- 
M. Raşit ÖZDAŞ
