hadoop-hdfs-issues mailing list archives

From "Uma Maheswara Rao G (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3586) Blocks are not getting replicate even DN's are availble.
Date Tue, 03 Jul 2012 01:26:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405484#comment-13405484 ]

Uma Maheswara Rao G commented on HDFS-3586:
-------------------------------------------

Hi Konstantin,

{code}
I don't see how your suggestion solves the problem, as it doesn't work in case when all replicas
are corrupt, if I understand it correctly.
{code}

When all replicas are corrupt, we should not expect the block to be replicated, because there
is no good replica to copy from.
But in this case we have 2 good replicas. What I am proposing is this: keeping a minimum of 3
replicas (the replication factor) is fine, whatever their state ((good + corrupt) or (all 3
corrupt) or (all 3 good)). If there are more than that and any replica is corrupt, let's
invalidate it. Then, even if we have only a single good replica, it can be copied to another
node in this kind of situation. A whole cluster filled with corrupt and good replicas of the
same block is a very rare case, with almost no chance of occurring in bigger clusters, but it
is worth handling for smaller clusters too. Because the cluster size here is larger than the
replication factor, the user expects the block to be replicated properly.
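
To make the proposal concrete, here is a minimal sketch of the rule as I read it. This is not
HDFS code: CorruptReplicaPolicy, Replica and nodesToInvalidate are hypothetical names used only
for illustration, and the "at least one good replica must exist" guard is my own assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the proposed rule; not part of HDFS. */
final class CorruptReplicaPolicy {

    /** A replica as seen by this toy model: which node holds it and whether it is corrupt. */
    static final class Replica {
        final String dataNode;
        final boolean corrupt;

        Replica(String dataNode, boolean corrupt) {
            this.dataNode = dataNode;
            this.corrupt = corrupt;
        }
    }

    /**
     * Returns the nodes whose corrupt replica should be invalidated.
     * Replicas are kept in any state while the total is at or below the
     * replication factor; only corrupt replicas in excess of it are dropped.
     */
    static List<String> nodesToInvalidate(List<Replica> replicas, int replicationFactor) {
        List<String> result = new ArrayList<>();
        int excess = replicas.size() - replicationFactor;

        boolean hasGood = false;
        for (Replica r : replicas) {
            if (!r.corrupt) {
                hasGood = true;
            }
        }
        // Assumption: never invalidate anything when no good replica exists.
        if (excess <= 0 || !hasGood) {
            return result;
        }
        for (Replica r : replicas) {
            if (r.corrupt && result.size() < excess) {
                result.add(r.dataNode);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The situation from this issue: 2 good replicas, 2 unusable ones, RF=3.
        List<Replica> replicas = List.of(
                new Replica("DN1", false),
                new Replica("DN2", false),
                new Replica("DN3", true),
                new Replica("DN4", true));
        System.out.println(nodesToInvalidate(replicas, 3)); // prints [DN3]
    }
}
```

With 4 replicas and RF=3 there is one replica in excess, so one corrupt replica is invalidated
and that node becomes a valid target for a copy of a good replica.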
                
> Blocks are not getting replicate even DN's are availble.
> --------------------------------------------------------
>
>                 Key: HDFS-3586
>                 URL: https://issues.apache.org/jira/browse/HDFS-3586
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node, name-node
>    Affects Versions: 2.0.0-alpha, 2.0.1-alpha, 3.0.0
>            Reporter: Brahma Reddy Battula
>         Attachments: HDFS-3586-analysis.txt
>
>
> Scenario:
> =========
> Started four DNs (say DN1, DN2, DN3 and DN4).
> Writing files with RF=3.
> A pipeline is formed with DN1->DN2->DN3.
> DN3's network is very slow, so it is not able to send acks.
> The pipeline is formed again with DN1->DN2->DN4.
> Here DN4's network is also slow.
> So finally commitblocksync happened on DN1 and DN2 successfully.
> The block is present on all four DNs (finalized state on two DNs and RBW state on the other
> two).
> Here the NN asks to replicate to DN3 and DN4, but it fails since replicas are already
> present in the RBW dir.
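
The quoted scenario can be modeled with a small, self-contained sketch. It is a toy simulation,
not HDFS code: RbwReplicationScenario, ReplicaState and attemptReplication are hypothetical
names, and the rule "a node rejects a copy if it already holds an RBW replica" is a
simplification of the failure described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy simulation of the stuck-replication scenario; not part of HDFS. */
final class RbwReplicationScenario {

    enum ReplicaState { NONE, RBW, FINALIZED }

    /**
     * Tries to bring the block up to the replication factor and returns the
     * number of finalized replicas afterwards. A copy to a node succeeds only
     * if the node holds no replica of the block at all.
     */
    static int attemptReplication(Map<String, ReplicaState> nodes, int replicationFactor) {
        int finalized = 0;
        for (ReplicaState state : nodes.values()) {
            if (state == ReplicaState.FINALIZED) {
                finalized++;
            }
        }
        for (Map.Entry<String, ReplicaState> entry : nodes.entrySet()) {
            if (finalized >= replicationFactor) {
                break;
            }
            if (entry.getValue() == ReplicaState.NONE) {
                entry.setValue(ReplicaState.FINALIZED); // copy succeeds
                finalized++;
            }
            // RBW: the copy is rejected because a replica already exists there.
        }
        return finalized;
    }

    public static void main(String[] args) {
        Map<String, ReplicaState> nodes = new LinkedHashMap<>();
        nodes.put("DN1", ReplicaState.FINALIZED);
        nodes.put("DN2", ReplicaState.FINALIZED);
        nodes.put("DN3", ReplicaState.RBW);
        nodes.put("DN4", ReplicaState.RBW);

        // Both remaining targets hold RBW replicas, so the block stays at 2.
        System.out.println(attemptReplication(nodes, 3)); // prints 2

        // Once the stale replica on DN4 is invalidated, the copy can proceed.
        nodes.put("DN4", ReplicaState.NONE);
        System.out.println(attemptReplication(nodes, 3)); // prints 3
    }
}
```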


        
