hadoop-hdfs-user mailing list archives

From Mahmood Naderan <nt_mahm...@yahoo.com>
Subject The correct way to find number of mappers and reducers
Date Sat, 18 Apr 2015 05:31:03 GMT
Hi,

There are good guides on the number of mappers and reducers in a hadoop job. For example:

Running Hadoop on Ubuntu Linux (Single-Node Cluster)    http://goo.gl/kaA1h5
Partitioning your job into maps and reduces     http://goo.gl/tpU23

However, I have a few beginner questions. Assume:

A. There are 32 cores on the machine
B. Hadoop is set up on a single machine
C. There are more than 100 files in HDFS, each 67 MB in size

Now the questions are

1) How can I determine the DFS block size? The guide states that "The number of maps is usually
driven by the number of DFS blocks in the input files".
2) What is the default value of io.file.buffer.size? I haven't set it.
3) Where exactly should I set those options, e.g. the number of mappers and reducers?
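
To make question 1 concrete, here is a minimal sketch of the block arithmetic behind the quoted statement. It assumes blocks never span files, takes 100 files as a lower bound for "more than 100", and tries two hypothetical block sizes (64 MB and 128 MB), since the actual block size of this setup is not known. The function name is made up for illustration, and the result is only a block count; the real number of map tasks may differ depending on how the input format computes its splits.

```python
import math


def estimate_block_count(num_files, file_size_mb, block_size_mb):
    """Count DFS blocks, assuming each file is stored separately in whole blocks."""
    # A file occupies ceil(size / block size) blocks; the last one may be partial.
    blocks_per_file = math.ceil(file_size_mb / block_size_mb)
    return num_files * blocks_per_file


if __name__ == "__main__":
    # 100 files of 67 MB each; both block sizes are assumptions, not measured values.
    for block_size_mb in (64, 128):
        blocks = estimate_block_count(100, 67, block_size_mb)
        print(f"block size {block_size_mb} MB -> about {blocks} blocks")
```

With a 64 MB block each 67 MB file spills into a second, nearly empty block, so the block count doubles compared to a 128 MB block. That is why the block size matters for this question.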


 
Regards,
Mahmood
