hadoop-hdfs-issues mailing list archives

From "Konstantin Boudnik (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-782) dynamic replication
Date Sat, 21 Nov 2009 02:21:39 GMT

    [ https://issues.apache.org/jira/browse/HDFS-782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780891#action_12780891 ]

Konstantin Boudnik commented on HDFS-782:

Off the top of my head, I can see three issues with this idea (I expect I'll be corrected
if this is all nonsense :-) ):
- It introduces a very different replication-triggering mechanism which, in turn, will still
have to communicate with the NN to update the number of replicas for some blocks, etc.
- It sounds like only some of the files will end up with a higher replication factor.
- Because block replication eats up a certain amount of network bandwidth, it could be used
for a 'replication storm' (possibly malicious): a few blocks are requested too often, which
triggers cascading replication. That could of course be detected and stopped, but it sounds
like a very complex addition to me.

> dynamic replication
> -------------------
>                 Key: HDFS-782
>                 URL: https://issues.apache.org/jira/browse/HDFS-782
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Ning Zhang
> In a large and busy cluster, a block can be requested by many clients at the same time.
> HDFS-767 tries to solve the failure case where the number of retries exceeds the maximum.
> However, that patch doesn't solve the performance issue, since all failing clients have to
> wait a certain period before retrying, and the number of retries could be high.
>
> One way to solve the performance issue is to increase the number of replicas for this
> "hot" block dynamically when it is requested many times within a short period. The name node
> needs to be aware of such a situation and should only clean up the extra replicas once they
> have not been accessed recently.
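
To make the proposal in the description a bit more concrete, here is a minimal sketch of the
bookkeeping it seems to imply: count reads per block in a sliding time window, flag a block
as "hot" once the count reaches a threshold, and allow cleanup of extra replicas only after
the block has been idle for a while. Everything here is hypothetical (the class
HotBlockTracker, its method names, and the three tuning parameters); none of it is an
existing HDFS API, and a real NN-side implementation would also need the storm protection
discussed in the comment above, which this sketch does not attempt.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not part of HDFS. */
class HotBlockTracker {
    private final long windowMillis;   // length of the sliding window (assumed parameter)
    private final int hotThreshold;    // reads within the window that make a block "hot"
    private final long idleMillis;     // idle time before extra replicas may be dropped

    // Per-block timestamps of recent reads, oldest first.
    private final Map<Long, Deque<Long>> recentReads = new HashMap<>();
    // Per-block timestamp of the most recent read.
    private final Map<Long, Long> lastAccess = new HashMap<>();

    HotBlockTracker(long windowMillis, int hotThreshold, long idleMillis) {
        this.windowMillis = windowMillis;
        this.hotThreshold = hotThreshold;
        this.idleMillis = idleMillis;
    }

    /** Records one read of the given block at the given time. */
    void recordRead(long blockId, long nowMillis) {
        Deque<Long> reads = recentReads.computeIfAbsent(blockId, id -> new ArrayDeque<>());
        reads.addLast(nowMillis);
        evictOld(reads, nowMillis);
        lastAccess.put(blockId, nowMillis);
    }

    /** True if the block was read at least hotThreshold times within the window. */
    boolean isHot(long blockId, long nowMillis) {
        Deque<Long> reads = recentReads.get(blockId);
        if (reads == null) {
            return false;
        }
        evictOld(reads, nowMillis);
        return reads.size() >= hotThreshold;
    }

    /** True if the block has never been read or has been idle for at least idleMillis. */
    boolean canDropExtraReplicas(long blockId, long nowMillis) {
        Long last = lastAccess.get(blockId);
        return last == null || nowMillis - last >= idleMillis;
    }

    // Drops timestamps that have fallen out of the sliding window.
    private void evictOld(Deque<Long> reads, long nowMillis) {
        while (!reads.isEmpty() && nowMillis - reads.peekFirst() > windowMillis) {
            reads.removeFirst();
        }
    }
}
```

With a 1-second window, a threshold of 3 reads, and a 5-second idle period, a block read
three times within 200 ms would be flagged as hot, would stop being hot once those reads
age out of the window, and would become eligible for cleanup 5 seconds after its last read.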

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.