hadoop-common-user mailing list archives

From Haijun Cao <haijun...@ymail.com>
Subject do NOT start reduce task until all mappers are finished
Date Mon, 24 Nov 2008 23:20:33 GMT

I am using Hadoop 0.18.2 with the fair scheduler (HADOOP-3476).

The purpose of the fair scheduler is to prevent long-running jobs
from blocking short jobs. I gave it a try: I started a long job first, then a
short one. The short job is able to grab some map slots and finishes its map
phase quickly, but it still blocks in the reduce phase, because the long job has
taken all the reduce slots (it started first, and its reducers were launched
shortly afterwards).
The long job's reducers won't finish until all of its mappers
have finished, so my short job is still blocked by the long job, which makes the
fair scheduler useless for my workload.
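
The blocking described above can be reproduced in a toy model. The sketch below is not Hadoop's scheduler: the Job class, the simulate function, the slot counts and the task counts are all hypothetical, every map task is assumed to take one tick, and a reducer is assumed to finish one tick after its job's last map completes. It only compares handing out reduce slots as soon as a job is submitted against handing them out once the job's maps are done.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    submit: int        # tick at which the job is submitted
    maps_left: int     # map tasks still to run (one tick each)
    reduces_left: int  # reduce tasks still to complete
    held: int = 0      # reduce slots currently occupied by this job
    done_at: int = -1  # tick at which the job completed, -1 while running


def simulate(wait_for_maps, map_slots=4, reduce_slots=2):
    """Toy model (hypothetical): return the completion tick of each job."""
    jobs = [Job("long", 0, 40, 2), Job("short", 1, 4, 2)]
    tick = 0
    while any(j.done_at < 0 for j in jobs):
        # Running reducers finish once all maps of their job are done.
        for j in jobs:
            if j.held > 0 and j.maps_left == 0:
                j.reduces_left -= j.held
                j.held = 0
                if j.reduces_left == 0:
                    j.done_at = tick

        active = [j for j in jobs if j.submit <= tick and j.done_at < 0]

        # Reduce slots are granted in submission order and held until the
        # reducer finishes. With wait_for_maps, a job is only eligible
        # once it has no maps left.
        free = reduce_slots - sum(j.held for j in jobs)
        for j in active:
            if j.maps_left == 0 or not wait_for_maps:
                grant = min(free, j.reduces_left - j.held)
                j.held += grant
                free -= grant

        # Map slots are shared round-robin among jobs with pending maps,
        # a crude stand-in for fair sharing.
        pending = [j for j in active if j.maps_left > 0]
        slots = map_slots
        while slots > 0 and pending:
            for j in list(pending):
                if slots == 0:
                    break
                j.maps_left -= 1
                slots -= 1
                if j.maps_left == 0:
                    pending.remove(j)

        tick += 1
    return {j.name: j.done_at for j in jobs}


print(simulate(wait_for_maps=False))  # {'long': 11, 'short': 12}
print(simulate(wait_for_maps=True))   # {'long': 12, 'short': 4}
```

In this model the short job finishes at tick 12 when reducers are started eagerly and at tick 4 when they wait for the maps, while the long job is delayed by one tick. The model ignores the benefit real reducers get from copying map output early, so the cost to the long job is understated.
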
I am wondering if there is a way to NOT start a job's reduce tasks
until all of its mappers have finished.

Haijun Cao
