hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1079) DFS Scalability: Incremental block reports
Date Thu, 01 May 2008 17:12:56 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12593583#action_12593583 ]

Doug Cutting commented on HADOOP-1079:
--------------------------------------

From http://www.nabble.com/-tt16976556.html

Long ago we talked of implementing partial, incremental block reports.
We'd divide blockid space into 64 sections.  The datanode would ask the
namenode for the hash of its block ids in a section.  Full block lists
would then only be sent when the hash differs.  Both sides would
maintain hashes of all sections in memory.  Then, instead of making a
block report every hour, we'd make a 1/64 block id check every minute.
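
The scheme above can be sketched roughly as follows. This is only an
illustration, not code from Hadoop or from the attached patch. The class
and function names are hypothetical, and two details are assumptions the
comment does not specify: a block is assigned to a section by its id
modulo 64, and a section's hash is the XOR of per-block SHA-1 digests,
so it can be updated in place when a block is added or removed.

```python
import hashlib

NUM_SECTIONS = 64  # from the comment: block id space divided into 64 sections


def section_of(block_id: int) -> int:
    # Assumption: sections are assigned by block id modulo 64.
    return block_id % NUM_SECTIONS


def block_hash(block_id: int) -> int:
    # Assumption: 64-bit block ids, hashed with SHA-1 and truncated to 8 bytes.
    digest = hashlib.sha1(block_id.to_bytes(8, "big", signed=True)).digest()
    return int.from_bytes(digest[:8], "big")


class SectionHashes:
    """Per-section block sets and hashes, kept in memory on both sides."""

    def __init__(self):
        self.blocks = [set() for _ in range(NUM_SECTIONS)]
        self.hashes = [0] * NUM_SECTIONS

    def add(self, block_id: int) -> None:
        s = section_of(block_id)
        if block_id not in self.blocks[s]:
            self.blocks[s].add(block_id)
            self.hashes[s] ^= block_hash(block_id)  # XOR in the new block

    def remove(self, block_id: int) -> None:
        s = section_of(block_id)
        if block_id in self.blocks[s]:
            self.blocks[s].remove(block_id)
            self.hashes[s] ^= block_hash(block_id)  # XOR the block back out


def check_section(datanode: SectionHashes, namenode_view: SectionHashes, section: int):
    """Return the datanode's full block list for a section only when the
    two hashes differ; return None when they match and nothing is sent."""
    if datanode.hashes[section] == namenode_view.hashes[section]:
        return None
    return sorted(datanode.blocks[section])
```

Checking one section per minute would cycle through all 64 sections
about every 64 minutes, which is close to the current hourly full report
but sends a block list only for sections that have diverged.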


> DFS Scalability: Incremental block reports
> ------------------------------------------
>
>                 Key: HADOOP-1079
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1079
>             Project: Hadoop Core
>          Issue Type: Sub-task
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: blockReportPeriod.patch
>
>
> I have a cluster that has 1800 datanodes. Each datanode has around 50000 blocks and sends
> a block report to the namenode once every hour. This means that the namenode processes a block
> report once every 2 seconds. Each block report contains all blocks that the datanode currently
> hosts. This makes the namenode compare a huge number of blocks that remain practically
> unchanged between two consecutive reports. This wastes CPU on the namenode.
> The problem becomes worse when the number of datanodes increases.
> One proposal is to make succeeding block reports (after a successful send of a full block
> report) be incremental. This will make the namenode process only those blocks that were
> added/deleted in the last period.
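
The figures in the description can be checked with a short calculation,
followed by a minimal sketch of what an incremental report might carry.
The `incremental_report` helper and its "added"/"deleted" layout are
hypothetical and are not taken from blockReportPeriod.patch. The sketch
also assumes the datanode remembers the block set it last reported
successfully.

```python
DATANODES = 1800              # from the description
BLOCKS_PER_DATANODE = 50_000  # "around 50000 blocks" per datanode
REPORT_INTERVAL_S = 3600      # one full report per datanode per hour

# Average gap between two block reports arriving at the namenode.
seconds_between_reports = REPORT_INTERVAL_S / DATANODES

# Average number of block entries the namenode must compare per second.
blocks_compared_per_second = DATANODES * BLOCKS_PER_DATANODE / REPORT_INTERVAL_S


def incremental_report(previous: set, current: set) -> dict:
    """Blocks added or deleted since the last successfully sent report."""
    return {"added": current - previous, "deleted": previous - current}


print(seconds_between_reports)      # 2.0
print(blocks_compared_per_second)   # 25000.0
print(incremental_report({1, 2, 3}, {2, 3, 4}))
```

With these numbers the namenode compares about 25000 block entries per
second on average, most of them unchanged, whereas an incremental report
is proportional only to the blocks that changed in the period.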

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

