hadoop-common-dev mailing list archives

From "Yoram Arnon" <yar...@yahoo-inc.com>
Subject RE: dfs datanode heartbeats and getBlockwork requests
Date Wed, 05 Apr 2006 16:30:07 GMT
The name node, on startup, should know which data nodes are expected to be
there, and should not make replication decisions before it knows which of them
are actually present and which are not.
A crude way to achieve that is simply to wait for a while, hoping that all
the data nodes connect.
A more refined way would be to compare the set of nodes that have connected
against the set expected to connect. That enables faster startup when everyone
connects quickly, and better robustness when some data nodes are slow to
connect, or when the name node is slow to process the barrage of connections.
The rule could be "no replications until X% of the expected nodes have
connected, AND there are no pending unprocessed connection messages". X
should be on the order of 90, perhaps less for very small clusters.
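The rule above is easy to state as a small gate object. The sketch below is purely illustrative (the class and method names are hypothetical, not from the Hadoop source); it assumes the name node knows the expected datanode count from configuration and counts connection messages as they are queued and processed:

```java
// Hypothetical sketch of the proposed start-up gate. Names are illustrative,
// not taken from Hadoop; the threshold plays the role of "X% on the order of 90".
public class StartupGate {
    private final int expectedDatanodes;       // known from configuration at startup
    private final double threshold;            // e.g. 0.90
    private int connectedDatanodes = 0;
    private int pendingConnectionMessages = 0;

    public StartupGate(int expectedDatanodes, double threshold) {
        this.expectedDatanodes = expectedDatanodes;
        this.threshold = threshold;
    }

    // A datanode connection message has arrived but is not yet processed.
    public synchronized void onConnectQueued() { pendingConnectionMessages++; }

    // The name node has finished processing one queued connection message.
    public synchronized void onConnectProcessed() {
        pendingConnectionMessages--;
        connectedDatanodes++;
    }

    // Replication decisions are allowed only when X% of expected nodes have
    // connected AND no connection messages remain unprocessed.
    public synchronized boolean mayReplicate() {
        return connectedDatanodes >= threshold * expectedDatanodes
            && pendingConnectionMessages == 0;
    }

    public static void main(String[] args) {
        StartupGate gate = new StartupGate(100, 0.90);
        for (int i = 0; i < 90; i++) {
            gate.onConnectQueued();
            gate.onConnectProcessed();
        }
        System.out.println(gate.mayReplicate()); // true: 90 of 100 connected, none pending
    }
}
```

This gives the "fast startup" property for free: a fully connected cluster opens the gate as soon as the last message is drained, with no fixed delay.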


-----Original Message-----
From: Hairong Kuang [mailto:hairong@yahoo-inc.com] 
Sent: Tuesday, April 04, 2006 5:09 PM
To: hadoop-dev@lucene.apache.org
Subject: RE: dfs datanode heartbeats and getBlockwork requests

I think it is better to implement the start-up delay at the namenode. The key
is that the name node should be able to tell whether or not it is in a steady
state, either at start-up or at runtime after a network disruption. It should
not instruct datanodes to replicate or delete any blocks before it has reached
a steady state.


-----Original Message-----
From: Doug Cutting [mailto:cutting@apache.org]
Sent: Tuesday, April 04, 2006 9:58 AM
To: hadoop-dev@lucene.apache.org
Subject: Re: dfs datanode heartbeats and getBlockwork requests

Eric Baldeschwieler wrote:
> If we moved to a scheme where the name node was just given a small 
> number of blocks with each heartbeat, there would be no reason to not 
> start reporting blocks immediately, would there?

There would still be a small storm of unneeded replications on startup.

Say it takes a minute at startup for all data nodes to report their complete
block lists to the name node. If heartbeats are every 3 seconds, then all but
the last data node to report in would be handed up to 20 small lists of blocks
to start replicating. And the switches could be saturated doing a lot of
unneeded transfers, which would slow startup.

Then, for the next minute after startup, the nodes would be told to delete
blocks that are now over-replicated. We'd like startup to be as fast and
painless as possible. Waiting a bit before checking to see if blocks are
over- or under-replicated seems a good way to achieve that.
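The back-of-the-envelope figure in that example can be checked directly. The numbers below (a one-minute startup window, 3-second heartbeats) are the assumptions from the example itself, not measured values:

```java
// Rough estimate of the startup "replication storm" from the example above:
// with per-heartbeat block lists, each datanode that reports early receives
// one small list per heartbeat for the rest of the startup window.
public class StartupStorm {
    public static void main(String[] args) {
        int startupWindowSec = 60;      // assumed: time for all block reports to arrive
        int heartbeatIntervalSec = 3;   // assumed: heartbeat period
        int listsPerEarlyNode = startupWindowSec / heartbeatIntervalSec;
        System.out.println(listsPerEarlyNode); // prints 20
    }
}
```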
