hadoop-common-user mailing list archives

From "Rong-en Fan" <gra...@gmail.com>
Subject slow copy makes reduce hang
Date Thu, 18 Sep 2008 19:34:27 GMT

I'm using Hadoop and seeing a reduce task hang in the shuffle phase because
of an unresponsive node. According to the reduce log (sorry, I didn't
keep it around), the task was stuck copying map output from a dead
node (I cannot ssh to it). At that point, all maps had already
finished. I'm wondering why this slow copy does not cause the reduce
task to fail, and the corresponding map to be marked as failed (even though
it had finished) and rerun on another node so that the reduce can proceed.

Rong-En Fan
