hadoop-user mailing list archives

From Jan Lukavský <jan.lukav...@firma.seznam.cz>
Subject Running map tasks after all reduces have finished
Date Thu, 23 Aug 2012 09:25:01 GMT
Hi all,

we are seeing strange JobTracker behaviour in the following scenario:
  - the job finishes the map phase and starts the reduce phase
  - after all reducers have finished their shuffle phase, we lose a 
tasktracker that isn't running any reducer, so all remaining reducers 
are still running in the reduce phase
  - the map tasks that were running on the lost tasktracker are rescheduled
  - the reduces may finish earlier than the rescheduled map tasks, so the 
job is blocked waiting for those maps to finish, although their output is 
simply discarded
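
The timing in this scenario can be sketched with a toy model. This is not Hadoop code: the function name, the flag, and all the numbers are made up for illustration, and it only assumes that a job is reported complete once every task it is waiting on has finished.

```python
def job_finish_time(reduce_finish_times, rerun_map_finish_times,
                    wait_for_rerun_maps):
    """Toy model (hypothetical, not the JobTracker's logic).

    reduce_finish_times    -- finish time of each reducer
    rerun_map_finish_times -- finish time of each rescheduled map task
    wait_for_rerun_maps    -- True models the behaviour described above,
                              False models a job that ends with its reducers
    """
    last_reduce = max(reduce_finish_times)
    if wait_for_rerun_maps and rerun_map_finish_times:
        # The job stays open until the rescheduled maps are done,
        # even though no reducer will ever fetch their output.
        return max(last_reduce, max(rerun_map_finish_times))
    return last_reduce


# Made-up timings in seconds since job start.
reduces = [100, 110]
rerun_maps = [150]

observed = job_finish_time(reduces, rerun_maps, wait_for_rerun_maps=True)
expected = job_finish_time(reduces, rerun_maps, wait_for_rerun_maps=False)
print(observed - expected)  # time spent waiting on maps nobody needs
```

With these made-up numbers the job would stay open well after the last reducer is done, purely because of the rescheduled maps.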

Is this behaviour a bug or a feature? :) I haven't found any JIRA that 
describes it; if one exists, could anyone point me to it?

