hadoop-common-dev mailing list archives

From "Hairong Kuang" <hair...@yahoo-inc.com>
Subject RE: dfs datanode heartbeats and getBlockwork requests
Date Wed, 05 Apr 2006 00:08:55 GMT
I think it is better to implement the start-up delay at the namenode. The
key is that the namenode should be able to tell whether it is in a steady
state, both at start-up and at runtime after a network disruption. It
should not instruct datanodes to replicate or delete any blocks before it
has reached a steady state.
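
A minimal sketch of such a gate follows, to make the idea concrete. It is
not Hadoop code. The class name, the method names, and the choice to define
"steady state" as "a configurable fraction of the expected datanodes have
reported their blocks" are all assumptions made for illustration; the
message above does not say how steady state should be detected.

```python
class SteadyStateGate:
    """Hypothetical namenode-side gate (illustration only, not Hadoop's API)."""

    def __init__(self, expected_datanodes, required_fraction=0.95):
        # Assumption: the namenode knows roughly how many datanodes to expect.
        self.expected_datanodes = expected_datanodes
        self.required_fraction = required_fraction
        self.reported = set()

    def block_report_received(self, datanode_id):
        # A datanode has sent its complete block list.
        self.reported.add(datanode_id)

    def datanode_lost(self, datanode_id):
        # For example, heartbeats stopped after a network disruption.
        self.reported.discard(datanode_id)

    def is_steady(self):
        # Steady once enough of the expected datanodes have reported.
        needed = self.required_fraction * self.expected_datanodes
        return len(self.reported) >= needed

    def block_work_for(self, datanode_id, pending_work):
        # Hand out no replicate/delete instructions until steady.
        if not self.is_steady():
            return []
        return pending_work
```

Because is_steady() is re-evaluated on every request, the same check covers
both start-up and a later disruption: if enough datanodes drop out, the gate
closes again until they report back.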


-----Original Message-----
From: Doug Cutting [mailto:cutting@apache.org] 
Sent: Tuesday, April 04, 2006 9:58 AM
To: hadoop-dev@lucene.apache.org
Subject: Re: dfs datanode heartbeats and getBlockwork requests

Eric Baldeschwieler wrote:
> If we moved to a scheme where the name node was just given a small 
> number of blocks with each heartbeat, there would be no reason to not 
> start reporting blocks immediately, would there?

There would still be a small storm of un-needed replications on startup.
Say it takes a minute at startup for all data nodes to report their
complete block lists to the name node.  If heartbeats are every 3 seconds,
then all but the last data node to report in would be handed 20 small
lists of blocks to start replicating.  And the switches could be saturated
doing a lot of un-needed transfers, which would slow startup.  Then, for
the next minute after startup, the nodes would be told to delete blocks
that are now over-replicated.  We'd like startup to be as fast and painless
as possible.  Waiting a bit before checking to see if blocks are over- or
under-replicated seems a good way.
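
The estimate above can be checked with a short calculation, shown here next
to a sketch of the proposed wait. The function names and the 60-second delay
are hypothetical, chosen only to match the numbers in the example (a
one-minute reporting window and a 3-second heartbeat interval).

```python
def heartbeats_during_startup(report_window_s, heartbeat_interval_s):
    # Number of heartbeats one data node sends while the others are still
    # reporting their block lists.
    return report_window_s // heartbeat_interval_s


def should_check_replication(now_s, start_time_s, startup_delay_s):
    # Skip over-/under-replication checks until the delay has elapsed.
    return now_s - start_time_s >= startup_delay_s


# One minute of reporting at one heartbeat every 3 seconds.
lists_handed_out = heartbeats_during_startup(60, 3)   # 20

# With a 60-second delay, no replication work is issued at t = 30 s.
early = should_check_replication(30, 0, 60)           # False
later = should_check_replication(60, 0, 60)           # True
```

With the delay in place, a data node that reports early receives no block
lists during the reporting window, instead of up to 20.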

