hadoop-common-user mailing list archives

From "Nguyen Manh Tien" <tien.nguyenm...@gmail.com>
Subject Re: Map task failure recovery
Date Fri, 19 Oct 2007 07:46:19 GMT
Owen, could you show me how to start additional datanodes or tasktrackers?


2007/10/19, Owen O'Malley <oom@yahoo-inc.com>:
>
> On Oct 18, 2007, at 8:05 PM, Ming Yang wrote:
>
> > In the original MapReduce paper from Google, it mentioned
> > that healthy workers can take over failed tasks from other
> > workers. Does Hadoop have the same failure recovery strategy?
>
> Yes. If a task fails on one node, it is assigned to another free node
> automatically.
>
> > Also, the other question is: in the paper, it seems that nodes can
> > be added/removed while the cluster is running jobs. How does
> > Hadoop achieve this? Since the slave locations are saved in a
> > file, the master doesn't know about new nodes until it
> > restarts and reloads the slave list.
>
> The slaves file is only used by the startup scripts when bringing up
> the cluster. If additional data nodes or task trackers (ie. slaves)
> are started they automatically join the cluster and will be given
> work. If the servers on one of the slaves are killed, the work will
> be redone on other nodes.
>
> -- Owen
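For reference, on Hadoop releases of this era (0.1x), joining a new slave to a running cluster is typically done by running the daemon scripts directly on the new machine; a rough sketch, assuming the new node has the same Hadoop install and configuration (pointing at the existing namenode/jobtracker) as the rest of the cluster:

```shell
# Run these on the NEW slave machine itself.
# Assumes HADOOP_HOME is set and conf/hadoop-site.xml already names
# the cluster's namenode and jobtracker, so the daemons know where
# to register.

# Start an HDFS datanode; it registers with the namenode automatically.
$HADOOP_HOME/bin/hadoop-daemon.sh start datanode

# Start a MapReduce tasktracker; it registers with the jobtracker
# and will begin receiving tasks.
$HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker
```

Adding the new host to conf/slaves is optional for joining, but doing so lets the start-all.sh/stop-all.sh scripts manage it on the next cluster restart.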
