hadoop-mapreduce-user mailing list archives

From Chris MacKenzie <stu...@chrismackenziephotography.co.uk>
Subject Re: total number of map tasks
Date Wed, 27 Aug 2014 16:23:17 GMT
It's my understanding that you don't get map tasks as such but containers. 

My experience is with version 2+.

And if that's true, the number of containers depends on the memory tuning in mapred-site.xml.

Otherwise I'd love to learn more. 
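
A rough sketch of how the memory settings bound the number of map containers that can run at the same time on one node. It says nothing about the total number of map tasks in a job. The property names in the comments are the standard YARN / MapReduce 2 ones; the numbers are made up, and the rounding rule is an assumption that depends on which scheduler is configured.

```python
import math


def concurrent_map_containers(node_memory_mb, map_memory_mb, min_allocation_mb):
    """Estimate how many map containers fit on one NodeManager at once.

    node_memory_mb    -> yarn.nodemanager.resource.memory-mb (yarn-site.xml)
    map_memory_mb     -> mapreduce.map.memory.mb (mapred-site.xml)
    min_allocation_mb -> yarn.scheduler.minimum-allocation-mb (yarn-site.xml)
    """
    # Assumption: each request is rounded up to a multiple of the
    # minimum allocation before it is granted.
    granted_mb = math.ceil(map_memory_mb / min_allocation_mb) * min_allocation_mb
    return node_memory_mb // granted_mb


# Hypothetical node: 8 GB for YARN, 1.5 GB requested per map task, 1 GB minimum.
print(concurrent_map_containers(8192, 1536, 1024))
```

With these example values each request is rounded up to 2048 MB, so four map containers fit on the node at once.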

Sent from my iPhone

> On 27 Aug 2014, at 12:14, Stijn De Weirdt <stijn.deweirdt@ugent.be> wrote:
> 
> hi all,
> 
> we are tuning yarn (or trying to) on our environment (shared filesystem, no hdfs) using terasort, and one of the main issues we are seeing is that an avg map task takes < 15sec. some tuning guides and websites suggest that ideally map tasks run between 40sec and 1 or 2 minutes.
> 
> (however, it's also not very clear if the recommendations are still valid for yarn)
> 
> in particular, we see way more map tasks than expected, and we are wondering how the number of map tasks per job run is determined.
> 
> teragen created 64 output files, so we are only expecting 64 map tasks, each processing one input file. however, we see something like 3000 tasks
> 
> 
> hints are much appreciated
> 
> stijn
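
On the question of how the number of map tasks is determined: as far as I know, in MapReduce 2 there is still one map task per input split, and each task then runs in a YARN container. For file-based input formats the split size is derived from the block size the filesystem reports and from mapreduce.input.fileinputformat.split.minsize / split.maxsize. Below is a minimal sketch of that calculation. The file size and block size are hypothetical, not the values from the job above, and the helper names are mine rather than Hadoop's.

```python
MB = 1024 * 1024
SPLIT_SLOP = 1.1  # the last split of a file may be up to 10% larger


def compute_split_size(block_size, min_size, max_size):
    """Split size rule used by FileInputFormat: clamp the block size."""
    return max(min_size, min(max_size, block_size))


def splits_for_file(file_size, split_size):
    """Count the splits (and so map tasks) for one non-empty, splittable file."""
    remaining = file_size
    count = 0
    while remaining / split_size > SPLIT_SLOP:
        count += 1
        remaining -= split_size
    if remaining > 0:
        count += 1  # the tail becomes the last split
    return count


# Hypothetical input: 64 files of 1.5 GB each, filesystem reporting 32 MB blocks,
# split.minsize and split.maxsize left at permissive values.
split_size = compute_split_size(32 * MB, 1, 2**63 - 1)
per_file = splits_for_file(1536 * MB, split_size)
print(per_file, 64 * per_file)
```

If the reported block size is small compared to the file size, each file is cut into many splits, which would explain seeing far more than one map task per file. Raising mapreduce.input.fileinputformat.split.minsize to at least the file size should give one split per file in this model.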
