hadoop-common-dev mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-923) DFS Scalability: datanode heartbeat timeouts cause cascading timeouts of other datanodes
Date Mon, 12 Feb 2007 21:37:05 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-923:
------------------------------------

    Attachment: pendingTransferThread2.patch

Incorporated review comments. The change from the previous version is in the method FSNamesystem.computeDatanodeWork().

> DFS Scalability: datanode heartbeat timeouts cause cascading timeouts of other datanodes
> ----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-923
>                 URL: https://issues.apache.org/jira/browse/HADOOP-923
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.10.1
>            Reporter: dhruba borthakur
>         Assigned To: dhruba borthakur
>         Attachments: pendingTransferThread2.patch
>
>
> The datanode sends a heartbeat to the namenode every 3 seconds. The namenode processes
the heartbeat and sends a list of blocks-to-be-replicated and blocks-to-be-deleted as part
of the heartbeat response.
> At times when a couple of datanodes fail, heartbeat processing on the namenode becomes
heavyweight. It acquires the global FSNamesystem lock, traverses the neededReplication
structure, generates a list of blocks to be replicated, and responds to the heartbeat message.
Determining the list of blocks-to-be-replicated takes plenty of CPU and, because it runs
under the global FSNamesystem lock, blocks processing of other heartbeats.
> It would improve scalability a lot if heartbeat processing did not require the FSNamesystem
lock. In fact, the "heartbeat" lock already exists for this purpose.
> I propose that the Heartbeat message be separated from the "retrieve blocks-to-replicate
and blocks-to-delete" messages. The datanode can continue to heartbeat once every 3 seconds,
while it can afford to "retrieve blocks-to-replicate" at a much coarser interval. Heartbeat
processing on the namenode will be fast because it does not require the global FSNamesystem
lock. Moreover, a datanode failure will not aggravate the heartbeat processing time on the
namenode.
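
The separation proposed above can be sketched roughly as follows. This is a minimal illustration and not the code in the attached patch: the class NamenodeSketch, all of its method names, and the 30-second block-work interval are assumptions made for the example. Only the 3-second heartbeat interval, the "heartbeat" lock, the global FSNamesystem lock and the neededReplication structure come from the description. The point it shows is that the heartbeat path touches only the lightweight lock, while the expensive block-work computation is the only path that takes the global one.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only: NamenodeSketch and its methods are hypothetical names,
// not the FSNamesystem API and not the contents of the patch.
class NamenodeSketch {
    static final long HEARTBEAT_INTERVAL_MS = 3_000L;   // from the issue text
    static final long BLOCK_WORK_INTERVAL_MS = 30_000L; // assumed "coarser" interval

    // Lightweight lock: guards only per-datanode liveness data.
    private final Object heartbeatLock = new Object();
    // Stand-in for the global FSNamesystem lock.
    private final ReentrantLock namesystemLock = new ReentrantLock();

    private final Map<String, Long> lastHeartbeatMillis = new HashMap<>();
    private final Deque<String> neededReplication = new ArrayDeque<>();

    // Fast path: records liveness without touching the global lock.
    void heartbeat(String datanodeId, long nowMillis) {
        synchronized (heartbeatLock) {
            lastHeartbeatMillis.put(datanodeId, nowMillis);
        }
    }

    // Returns the last recorded heartbeat time, or -1 if none was seen.
    long lastHeartbeat(String datanodeId) {
        synchronized (heartbeatLock) {
            Long t = lastHeartbeatMillis.get(datanodeId);
            return t == null ? -1L : t;
        }
    }

    // Queues a block that needs another replica.
    void addNeededReplication(String blockId) {
        namesystemLock.lock();
        try {
            neededReplication.add(blockId);
        } finally {
            namesystemLock.unlock();
        }
    }

    // Slow path: the only call that holds the global lock while it walks
    // the replication queue. A real namenode would pick blocks per
    // datanode; this sketch ignores datanodeId and hands out blocks FIFO.
    List<String> retrieveBlockWork(String datanodeId, int maxBlocks) {
        namesystemLock.lock();
        try {
            List<String> work = new ArrayList<>();
            while (work.size() < maxBlocks && !neededReplication.isEmpty()) {
                work.add(neededReplication.poll());
            }
            return work;
        } finally {
            namesystemLock.unlock();
        }
    }

    // Datanode-side check: is it time to ask for block work again?
    static boolean blockWorkDue(long lastFetchMillis, long nowMillis) {
        return nowMillis - lastFetchMillis >= BLOCK_WORK_INTERVAL_MS;
    }

    public static void main(String[] args) {
        NamenodeSketch nn = new NamenodeSketch();
        nn.addNeededReplication("blk_1");
        nn.addNeededReplication("blk_2");
        nn.heartbeat("dn1", HEARTBEAT_INTERVAL_MS);
        System.out.println(nn.retrieveBlockWork("dn1", 1));
    }
}
```

In this arrangement a burst of replication work after a datanode failure lengthens only the retrieveBlockWork calls. Heartbeats from the remaining datanodes contend only for the small heartbeat lock, so they should not time out behind it.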

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

