hbase-user mailing list archives

From Rasit OZDAS <rasitoz...@gmail.com>
Subject Re: Number map task and reduce?
Date Mon, 16 Feb 2009 07:05:56 GMT
Hi, NguyenHuynh

If you need a precise solution, there is an article giving the following
formulas to compute the number of maps/reduces:
Split size        :   max(min(block_size, data/#maps), min_split_size)
Number of maps    :   roughly data / split size
Number of reduces :   0.95*num_nodes*mapred.tasktracker.reduce.tasks.maximum
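
As a rough sketch of how these two formulas could be evaluated (the function
names and the example numbers are my own assumptions, not from the article;
I read the first expression as a split size, so the map count is the data
size divided by it):

```python
import math

MB = 1024 * 1024
GB = 1024 * MB

def split_size(block_size, total_bytes, requested_maps, min_split_size=1):
    # max(min(block_size, data/#maps), min_split_size), all sizes in bytes
    goal_size = total_bytes // max(requested_maps, 1)
    return max(min(block_size, goal_size), min_split_size)

def estimated_maps(total_bytes, size):
    # Roughly one map task per split
    return math.ceil(total_bytes / size)

def suggested_reduces(num_nodes, reduce_slots_per_node, factor=0.95):
    # 0.95 * num_nodes * mapred.tasktracker.reduce.tasks.maximum, rounded down
    return max(1, int(factor * num_nodes * reduce_slots_per_node))

# Hypothetical numbers: 10 GB of input, 64 MB blocks, 3 nodes, 2 reduce slots each
size = split_size(64 * MB, 10 * GB, requested_maps=2)
print(size // MB, estimated_maps(10 * GB, size), suggested_reduces(3, 2))
```

With these made-up inputs it gives 64 MB splits, about 160 maps, and 5 reduces.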

If you need an "understandable" solution, try the following link:

If you just need a rough estimate:
Maps: The right level of parallelism for maps seems to be around
10-100 maps per node.
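
For the 3 machines in your question, that rule of thumb works out as in
this small sketch (the function name is hypothetical, and it assumes all
3 machines run a tasktracker):

```python
def rough_map_range(num_nodes, low_per_node=10, high_per_node=100):
    # 10-100 maps per node, scaled by the number of nodes
    return num_nodes * low_per_node, num_nodes * high_per_node

low, high = rough_map_range(3)
print(low, high)
```

So somewhere between 30 and 300 maps in total would be in the suggested range.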

There is also another calculation (I find it worse than the others), which I
can't find right now. It's something like:
Maps   : An even number close to several times the number
of task trackers.
Reduces: Same as the number of task trackers.

Are you also using the namenode and jobtracker machines as tasktrackers?

If you run into difficulties, please let us know.

2009/2/16 nguyenhuynh <nguyenhuynh@asnet.com.vn>:
> Hi all!
> I have 3 machines that I use to run Hadoop/HBase map-reduce. I don't know
> what values to set for the number of map and reduce tasks.
> How many map and reduce tasks should I use in this case?
> Please, help me!
> Thanks,
> Regards,
> NguyenHuynh

M. Raşit ÖZDAŞ