hadoop-hdfs-user mailing list archives

From Stijn De Weirdt <stijn.dewei...@ugent.be>
Subject total number of map tasks
Date Wed, 27 Aug 2014 11:14:43 GMT
hi all,

we are tuning yarn (or trying to) on our environment (shared filesystem, 
no hdfs) using terasort, and one of the main issues we are seeing is that 
an average map task takes < 15sec. some tuning guides and websites suggest 
that ideally map tasks run between 40sec and 1 or 2 minutes.

(however, it's also not very clear if the recommendations are still 
valid for yarn)

in particular, we see way more map tasks than expected, and we are 
wondering how the number of map tasks per job run is determined.

teragen created 64 output files, so we expected only 64 map tasks, 
each processing one input file. however, we see something like 3000 tasks.
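
for context, a minimal sketch of the split arithmetic that the stock 
FileInputFormat applies, as far as the hadoop source goes: split size is 
max(minsize, min(maxsize, blocksize)), and each file is cut into splits 
of that size, with the last split allowed to run 10% over. this is a 
python mirror of that rule, not hadoop code. the file size (1536 MiB) 
and the block size (32 MiB) below are made-up numbers for illustration, 
not measured on our setup, and the function names are hypothetical.

```python
MIB = 1024 * 1024
SPLIT_SLOP = 1.1  # the last split may be up to 10% larger than split_size


def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Mirror of the rule max(minsize, min(maxsize, blocksize))."""
    return max(min_size, min(max_size, block_size))


def splits_for_file(file_size, split_size):
    """Count the splits for one non-empty file (one map task per split)."""
    remaining = file_size
    count = 0
    # Cut full-size splits while more than 1.1 split sizes remain.
    while remaining / split_size > SPLIT_SLOP:
        remaining -= split_size
        count += 1
    # Whatever is left becomes the final split.
    if remaining > 0:
        count += 1
    return count


# ASSUMPTION: 64 input files of 1536 MiB each, and a filesystem that
# reports a 32 MiB block size. Both values are illustrative only.
num_files = 64
file_size = 1536 * MIB
block_size = 32 * MIB

default_split = compute_split_size(block_size)
print(num_files * splits_for_file(file_size, default_split))  # 3072

# Raising the minimum split size to the file size gives one split per file.
big_split = compute_split_size(block_size, min_size=file_size)
print(num_files * splits_for_file(file_size, big_split))  # 64
```

under these assumptions the task count follows the reported block size 
rather than the file count, and the minimum split size 
(mapreduce.input.fileinputformat.split.minsize) is the knob that would 
pull it back toward one task per file. whether our shared filesystem 
behaves like this is exactly what we are unsure about.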

hints are much appreciated

