hadoop-hdfs-issues mailing list archives

From "Walter Su (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6101) TestReplaceDatanodeOnFailure fails occasionally
Date Sat, 21 Nov 2015 02:41:11 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15020188#comment-15020188 ]

Walter Su commented on HDFS-6101:
---------------------------------

bq. When 10 writers begin to write at the same time, the policy will not allow some writers to set up pipelines with 3 data nodes, due to the load factor of the data nodes.
It happens because the test only starts a few DNs and writes a lot of files. In production it won't be a problem. I have seen nodes excluded by the placement policy many times when writing tests for erasure-coded files, which write to 9 DNs concurrently for a single file.

So could you try
{{conf.setBoolean(DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_KEY, false);}}
and not reduce the number of writers?
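
For reference, a minimal sketch of how that could look in the test setup. The class name, DN count, and MiniDFSCluster scaffolding below are illustrative assumptions, not taken from the actual test or from any of the attached patches:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

// Hypothetical test scaffolding: disable the NameNode's load check so that
// heavily loaded DNs in the small mini cluster remain eligible pipeline
// targets, and keep the original number of concurrent writers.
public class ConsiderLoadTestSetup {
  public static void main(String[] args) throws Exception {
    Configuration conf = new HdfsConfiguration();
    // Ignore DataNode xceiver load when choosing replication targets.
    conf.setBoolean(DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_KEY, false);

    // Small cluster, as in the test environment; 3 DNs is an assumed value.
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
        .numDataNodes(3)
        .build();
    try {
      cluster.waitActive();
      // ... start the concurrent writers here, unchanged ...
    } finally {
      cluster.shutdown();
    }
  }
}
{code}

The idea is that the flakiness comes from the placement policy's load check rather than from the number of writers, so turning off {{dfs.namenode.replication.considerLoad}} in the test configuration should be enough.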


> TestReplaceDatanodeOnFailure fails occasionally
> -----------------------------------------------
>
>                 Key: HDFS-6101
>                 URL: https://issues.apache.org/jira/browse/HDFS-6101
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Arpit Agarwal
>            Assignee: Wei-Chiu Chuang
>         Attachments: HDFS-6101.001.patch, HDFS-6101.002.patch, HDFS-6101.003.patch, HDFS-6101.004.patch,
HDFS-6101.005.patch, TestReplaceDatanodeOnFailure.log
>
>
> Exception details in a comment below.
> The failure repros on both OS X and Linux if I run the test ~10 times in a loop.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
