hadoop-hdfs-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HDFS-150) Replication should be decoupled from heartbeat
Date Thu, 17 Jul 2014 18:39:04 GMT

     [ https://issues.apache.org/jira/browse/HDFS-150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer resolved HDFS-150.

    Resolution: Fixed

Closing as fixed then!

> Replication should be decoupled from heartbeat
> ----------------------------------------------
>                 Key: HDFS-150
>                 URL: https://issues.apache.org/jira/browse/HDFS-150
>             Project: Hadoop HDFS
>          Issue Type: Bug
>         Environment: Hadoop 80 node cluster
>            Reporter: Srikanth Kakani
> I did a simple experiment: shooting down one node in the cluster and measuring the time
> taken to replicate the under-replicated blocks.
> ~30000 blocks were under-replicated (~400 per node), which should take 200 minutes to
> replicate completely given a 1 minute heartbeat interval.
> My findings: it took around 220 minutes, which is reasonable.
> Bug: Replication is coupled with heartbeat. The heartbeat interval is based on how much a
> namenode can handle. Replication should be based on how much a datanode can handle.
> So given the default heartbeat interval of 20 seconds, we computed that datanodes can handle
> 2 replications in that interval, based on which Namenodes give 2 blocks per heartbeat to replicate.
> What we propose is to keep the 20 seconds / 2 blocks ratio constant, and hence a datanode coming
> in with a heartbeat interval of 1 minute should be given 6 blocks to replicate per heartbeat.
> In this case, instead of taking 200 minutes it should take 200/3 ~1 hour to replicate the entire
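
The arithmetic in the quoted report can be checked with a small sketch. This is only an illustration of the proposal as described above, not the actual NameNode code: the function names and parameters are hypothetical, and the model assumes each datanode replicates exactly its per-heartbeat quota and nothing else limits throughput.

```python
import math

# Figures taken from the report: 2 blocks per 20-second heartbeat is the
# rate a datanode was computed to handle.
BASE_INTERVAL_SECS = 20
BASE_BLOCKS = 2


def blocks_per_heartbeat(heartbeat_secs):
    """Hypothetical quota: scale the 20 s / 2 blocks ratio to the actual interval."""
    return max(1, BASE_BLOCKS * heartbeat_secs // BASE_INTERVAL_SECS)


def minutes_to_replicate(blocks_per_node, heartbeat_secs, quota):
    """Time for one datanode to work through its share, one quota per heartbeat."""
    heartbeats = math.ceil(blocks_per_node / quota)
    return heartbeats * heartbeat_secs / 60


# Observed setup: ~400 blocks per node, 1 minute heartbeat, fixed quota of 2.
current = minutes_to_replicate(400, 60, BASE_BLOCKS)

# Proposed setup: quota scaled with the heartbeat interval (6 blocks per minute).
proposed = minutes_to_replicate(400, 60, blocks_per_heartbeat(60))

print(current, proposed)
```

Under these assumptions the fixed quota gives the 200 minutes estimated in the report, while the scaled quota of 6 blocks per heartbeat gives roughly 200/3 minutes.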

This message was sent by Atlassian JIRA
