hadoop-hdfs-user mailing list archives

From Patai Sangbutsarakum <silvianhad...@gmail.com>
Subject heartbeat and timeout question
Date Wed, 22 May 2013 00:47:56 GMT
Hello Hadoopers,
I am going to migrate production racks of datanodes/tasktrackers to new
core switches. Rack awareness has been in place for a long time. I am looking
for a way to avoid re-replicating the blocks of the datanodes in the rack that
is being moved (once they are marked as dead nodes), and to avoid shifting the
tasks running on those tasktrackers to other machines.

One approach I can think of is to adjust the heartbeat settings of both the
datanode and the tasktracker so that the timeout is extra long, say 15 minutes,
making the namenode and jobtracker more forgiving toward the nodes being moved.
However, the network operation needed to flip the switch should take only
a couple of minutes per rack.
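
To make the timing concrete, here is a rough sketch of the arithmetic, not a
verified recipe for cdh3u4. It assumes the namenode declares a datanode dead
after 2 * heartbeat.recheck.interval + 10 * dfs.heartbeat.interval, with
defaults of 300000 ms and 3 s respectively; both the formula and the property
names are assumptions that should be checked against the cluster's actual
version and configuration. The function names are hypothetical helpers.

```python
# Sketch of the namenode dead-node timeout arithmetic.
# ASSUMPTION: expiry = 2 * recheck interval + 10 * heartbeat interval,
# with defaults of 300000 ms and 3 s; verify against your Hadoop version.

def datanode_expiry_seconds(recheck_interval_ms=300_000, heartbeat_interval_s=3):
    """Seconds of heartbeat silence before a datanode is treated as dead."""
    return 2 * (recheck_interval_ms / 1000) + 10 * heartbeat_interval_s


def recheck_interval_ms_for(target_expiry_s, heartbeat_interval_s=3):
    """Recheck interval (ms) that yields the requested expiry window."""
    return int((target_expiry_s - 10 * heartbeat_interval_s) / 2 * 1000)


default_expiry = datanode_expiry_seconds()        # 630.0 s, i.e. 10.5 minutes
needed_recheck = recheck_interval_ms_for(15 * 60) # 435000 ms for a 15-minute window

print(f"default expiry: {default_expiry} s")
print(f"recheck interval for 15 min: {needed_recheck} ms")
```

If those defaults apply, the window is already about 10.5 minutes, which is
longer than a switch flip of a couple of minutes. The tasktracker side is
governed separately (assumed here to be mapred.tasktracker.expiry.interval on
the jobtracker), so it would need its own check.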

Possible alternatives are more than welcome.

Thanks in advance,

btw, the cluster is on cdh3u4 (0.20 branch)
