hadoop-common-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: HDFS replica management
Date Tue, 17 Jul 2007 18:49:57 GMT
Phantom wrote:
> I am sure re-replication is not done on every heartbeat miss since that
> would be very expensive and inefficient. At the same time you cannot really
> tell if a node is partitioned away, crashed or just slow. Is it threshold
> based, i.e. I missed N heartbeats, so re-replicate?

Yes, detection of datanode failure is threshold-based.  The threshold is
currently ten minutes plus ten missed heartbeats.
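
A minimal sketch of such a threshold check, for illustration only.  This
is not the Hadoop source: the class and method names are hypothetical,
and the 3-second heartbeat interval is an assumption.  Only the "ten
minutes plus ten missed heartbeats" rule comes from the message above.

```java
import java.time.Duration;

// Hypothetical sketch, not the Hadoop implementation. The names and the
// 3-second heartbeat interval are assumptions made for illustration.
final class DatanodeLivenessSketch {
    // Assumed time between two heartbeats from a datanode.
    static final Duration HEARTBEAT_INTERVAL = Duration.ofSeconds(3);

    // Rule from the message: ten minutes plus ten missed heartbeats.
    static final Duration EXPIRY =
        Duration.ofMinutes(10).plus(HEARTBEAT_INTERVAL.multipliedBy(10));

    // A node counts as dead only once its silence exceeds the threshold,
    // so a single missed heartbeat never triggers re-replication.
    static boolean isDead(long lastHeartbeatMillis, long nowMillis) {
        return nowMillis - lastHeartbeatMillis > EXPIRY.toMillis();
    }
}
```

Under these assumptions a node that has been silent for one heartbeat
interval is still considered alive, and one that has been silent for
longer than ten and a half minutes is considered dead.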

> Which package in the
> source code could I look at to glean this information?

This is in dfs/FSNamesystem.java.

Doug
