hadoop-hdfs-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2486) Review issues with UnderReplicatedBlocks
Date Thu, 23 Oct 2014 14:17:41 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181388#comment-14181388 ]

Hudson commented on HDFS-2486:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1910 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1910/])
Move HDFS-2486 down to 2.7.0 in CHANGES.txt (wang: rev 08457e9e57e4fa3c83217fd0a092e926ba7eb135)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Review issues with UnderReplicatedBlocks
> ----------------------------------------
>
>                 Key: HDFS-2486
>                 URL: https://issues.apache.org/jira/browse/HDFS-2486
>             Project: Hadoop HDFS
>          Issue Type: Task
>          Components: namenode
>    Affects Versions: 0.23.0
>            Reporter: Steve Loughran
>            Assignee: Uma Maheswara Rao G
>            Priority: Minor
>             Fix For: 2.7.0
>
>         Attachments: HDFS-2486.patch
>
>
> Here are some things I've noted in the UnderReplicatedBlocks class that someone else
> should review and consider if the code is correct. If not, they are easy to fix.
> remove(Block block, int priLevel) is not synchronized, and as the inner classes are not,
> there is a risk of race conditions there.
> Some of the code assumes that getPriority can return the value LEVEL, and if so does
> not attempt to queue the blocks. As this return value is not currently possible, those checks
> can be removed.
> The queue gives priority to blocks whose replication count is less than a third of their
> expected count over those that are "normally under replicated". While this is good for ensuring
> that files scheduled for large replication are replicated fast, it may not be the best strategy
> for maintaining data integrity. For that it may be better to give blocks that have only
> two replicas priority over blocks that may, for example, already have 3 out of 10 copies in
> the filesystem.
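
The three points in the description above can be illustrated with a minimal Java sketch. This is not the Hadoop UnderReplicatedBlocks code: the class name ReplicationQueuesSketch, the two-level queue layout, the URGENT/NORMAL constants, and both priority functions are hypothetical and exist only to make the trade-off concrete. The sketch assumes block IDs are plain long values, and the threshold of two replicas is taken from the example in the description.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; NOT the Hadoop UnderReplicatedBlocks implementation.
class ReplicationQueuesSketch {
    static final int LEVEL = 2;   // number of priority queues (assumed for this sketch)
    static final int URGENT = 0;  // served first
    static final int NORMAL = 1;  // "normally under replicated"

    private final List<Set<Long>> queues = new ArrayList<>();

    ReplicationQueuesSketch() {
        for (int i = 0; i < LEVEL; i++) {
            queues.add(new HashSet<>());
        }
    }

    // Ratio policy as described in the issue: a block with fewer than
    // a third of its expected replicas is served before the others.
    static int ratioPriority(int live, int expected) {
        return (live * 3 < expected) ? URGENT : NORMAL;
    }

    // Alternative floated in the issue: rank by absolute replica count.
    // The expected count is deliberately ignored here.
    static int absolutePriority(int live, int expected) {
        return (live <= 2) ? URGENT : NORMAL;
    }

    // Both policies only return values in [0, LEVEL), so no
    // "priority == LEVEL" guard is needed before queueing.
    synchronized boolean add(long blockId, int live, int expected) {
        return queues.get(absolutePriority(live, expected)).add(blockId);
    }

    // Synchronized so a removal cannot interleave with a concurrent add.
    synchronized boolean remove(long blockId, int priLevel) {
        if (priLevel < 0 || priLevel >= LEVEL) {
            return false;
        }
        return queues.get(priLevel).remove(blockId);
    }

    synchronized int size(int priLevel) {
        return queues.get(priLevel).size();
    }
}
```

Under these assumptions, a block with 2 of 3 replicas is NORMAL under the ratio policy while a block with 3 of 10 is URGENT; the absolute policy reverses that order, which is the data-integrity concern raised in the description.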



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
