hadoop-common-user mailing list archives

From Joe Hammerman <jhammer...@videoegg.com>
Subject Question regarding downtime model for DataNodes
Date Tue, 26 May 2009 22:08:40 GMT
Hello Hadoop Users list:

We are running Hadoop version 0.18.2. My team lead has asked me to investigate
how Hadoop handles offline DataNodes. Specifically, we would like to know how
long a node can be offline before it is completely rebuilt once it has been
re-added to the cluster.
From what I've been able to determine from the documentation, it appears that
the NameNode simply begins scheduling block replication on the remaining cluster
members. If the offline node comes back online and reports all its blocks as
uncorrupted, the NameNode just cleans up the "extra" blocks.
In other words, there is no explicit handling keyed to the length of the outage;
how much re-replication and cleanup takes place simply depends on how long the
node was away.
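
To make the question concrete, here is a toy Python model of the behavior as I understand it. It is not Hadoop code: the function names, the node and block names, and the max_copies parameter (a stand-in for outage duration) are all made up for illustration, and the choice of which excess replica gets deleted is arbitrary here.

```python
# Toy model only. Nothing here is a Hadoop API; all names are hypothetical.

def take_offline(blocks, node):
    """Drop `node` from every replica set; return the block ids it held."""
    held = sorted(b for b, nodes in blocks.items() if node in nodes)
    for b in held:
        blocks[b].discard(node)
    return held


def rereplicate(blocks, live_nodes, target, max_copies):
    """Copy under-replicated blocks to live nodes, at most `max_copies` copies.

    `max_copies` stands in for outage duration: a longer outage leaves
    time for more copies to be made.
    """
    made = 0
    for b in sorted(blocks):
        while len(blocks[b]) < target and made < max_copies:
            candidates = sorted(n for n in live_nodes if n not in blocks[b])
            if not candidates:
                break
            blocks[b].add(candidates[0])
            made += 1
    return made


def bring_back(blocks, node, reported, target):
    """Re-register the node's reported blocks, then trim excess replicas."""
    removed = 0
    for b in reported:
        blocks[b].add(node)
        while len(blocks[b]) > target:
            # Which replica is dropped is an arbitrary choice in this toy.
            blocks[b].remove(sorted(blocks[b])[-1])
            removed += 1
    return removed


def simulate(max_copies):
    """Return how many "extra" replicas are cleaned up after dn1 returns."""
    blocks = {
        "blk_1": {"dn1", "dn2", "dn3"},
        "blk_2": {"dn1", "dn2", "dn4"},
        "blk_3": {"dn2", "dn3", "dn4"},
    }
    live = ["dn2", "dn3", "dn4"]
    held = take_offline(blocks, "dn1")
    rereplicate(blocks, live, target=3, max_copies=max_copies)
    return bring_back(blocks, "dn1", held, target=3)
```

In this model, simulate(0) (the node returns before any copy is made) leaves nothing to clean up, while a larger max_copies leaves more "extra" replicas to delete. No duration threshold appears anywhere, which is the behavior I am asking the list to confirm or correct.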

Anyone care to shed some light on this?

Joseph Hammerman
