hadoop-common-user mailing list archives

From Serge Blazhiyevskyy <Serge.Blazhiyevs...@nice.com>
Subject Re: Questions with regard to scheduling of map and reduce tasks
Date Thu, 30 Aug 2012 18:02:59 GMT
The first scenario is expected behavior. And yes, you should limit the
number of reducers.
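For completeness, the workaround can be expressed as job configuration. A minimal sketch (the values are illustrative, chosen for the 16-node, two-containers-per-node cluster described below; `mapreduce.job.reduce.slowstart.completedmaps` controls what fraction of maps must complete before reducers are launched):

```xml
<!-- mapred-site.xml, or pass per-job via -D overrides; values are illustrative -->
<configuration>
  <!-- Cap reducers below the number of available containers
       (16 machines x 2 containers = 32 containers total) -->
  <property>
    <name>mapreduce.job.reduces</name>
    <value>24</value>
  </property>
  <!-- Delay reducer launch until most maps have finished, so reducers
       do not occupy containers that pending map tasks still need -->
  <property>
    <name>mapreduce.job.reduce.slowstart.completedmaps</name>
    <value>0.8</value>
  </property>
</configuration>
```

Raising the slowstart threshold trades earlier shuffle overlap for map throughput, which helps when containers are scarce.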


On 8/30/12 10:41 AM, "Vasco Visser" <vasco.visser@gmail.com> wrote:

>When running a job with more reducers than containers available in the
>cluster all reducers get scheduled, leaving no containers available
>for the mappers to be scheduled. The result is starvation and the job
>never finishes. Is this to be considered a bug or is it expected
>behavior? The workaround is to limit the number of reducers to less
>than the number of containers available.
>Also, it seems that tasks are picked and scheduled at random from the
>combined pool of pending map and reduce tasks. This causes less than
>optimal behavior. For example, I ran a job with 500 mappers and 30
>reducers (my cluster has only 16 machines, two containers per machine
>(dual-core machines)). What I observe is that halfway through the job
>all reduce tasks are scheduled, leaving only one container for 200+
>map tasks. Again, is this expected behavior? If so, what is the idea
>behind it? And are the map and reduce tasks indeed scheduled at random,
>or does it only look like they are?
>Any advice is welcome.
