hadoop-hdfs-issues mailing list archives

From "Yi Liu (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-8863) The remaining space check in BlockPlacementPolicyDefault is flawed
Date Wed, 12 Aug 2015 03:50:46 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692870#comment-14692870
] 

Yi Liu edited comment on HDFS-8863 at 8/12/15 3:50 AM:
-------------------------------------------------------

{quote}
What if we let it check against storage type level sum and also make sure there is at least
one storage with enough space?
{quote}
There is still a potential issue.  For example, suppose datanode dn0 has three storages (s1, s2,
s3) of the required storage type. s1 and s3 each have 2/3 of a block size of remaining space, and
s2 has 1 + 2/3 block sizes remaining. We have just scheduled one block on dn0, which will certainly
land on s2. Now a new block is being added and block placement checks dn0. With the current patch,
it sees that the maximum remaining space is 1 + 2/3 block sizes (s2) and that the sum is also
sufficient, so it treats dn0 as a good target. But actually it is not: once the scheduled block is
written, no storage has room for a full block.

I am thinking we can do the following: keep the storage type level sum, but for each storage
count only the part of the remaining space that is a whole multiple of the block size. In the
example above, the remaining space of s1 and s3 counts as 0 and s2 counts as 1 block, so the sum
is 1 block, which does not cover the scheduled block plus the new one, and dn0 is not a good
target.  With this approach, we don't need to check the maximum either.
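
To make the difference concrete, here is a minimal sketch of the two checks applied to the dn0 example. It is not the code from either patch: the class and method names are hypothetical, sizes are in arbitrary units with a block size of 3 so that 2/3 of a block is a whole number, and the required space is assumed to be block size * (scheduled blocks + 1).

```java
public class RemainingSpaceSketch {

    /**
     * Hypothetical rendering of the check discussed above: the storage type
     * level sum must cover the scheduled blocks plus the new block, and at
     * least one storage must have room for a full block.
     */
    static boolean sumAndMaxCheck(long[] remaining, long blockSize, int scheduledBlocks) {
        long required = blockSize * (scheduledBlocks + 1); // assumed formula
        long sum = 0;
        long max = 0;
        for (long r : remaining) {
            sum += r;
            max = Math.max(max, r);
        }
        return sum >= required && max >= blockSize;
    }

    /**
     * Proposed alternative: for each storage, count only the part of the
     * remaining space that is a whole multiple of the block size.
     */
    static boolean wholeBlockCheck(long[] remaining, long blockSize, int scheduledBlocks) {
        long required = blockSize * (scheduledBlocks + 1); // assumed formula
        long usable = 0;
        for (long r : remaining) {
            usable += (r / blockSize) * blockSize; // integer division drops the partial block
        }
        return usable >= required;
    }

    public static void main(String[] args) {
        long blockSize = 3;           // one block = 3 units
        long[] remaining = {2, 5, 2}; // s1 = 2/3, s2 = 1 + 2/3, s3 = 2/3 of a block
        int scheduled = 1;            // one block already scheduled on dn0

        System.out.println(sumAndMaxCheck(remaining, blockSize, scheduled));  // true
        System.out.println(wholeBlockCheck(remaining, blockSize, scheduled)); // false
    }
}
```

With these numbers the sum is 9 and the maximum is 5, so the first check accepts dn0, while the whole-block count is 0 + 3 + 0 = 3, below the 6 units required, so the second check rejects it.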

{quote}
Datanodes only care about the storage type, so checking a particular storage won't do any good.
It will just cause block placement to re-pick target more.
{quote}
You are right. I also meant something else: iterating over the storages is meant to check the
remaining space for the storage type, but some storages may be {{State.FAILED}} or {{State.READ_ONLY_SHARED}},
and their remaining space is still counted, right?  So I think you can do these checks in {{getRemaining}}.
 See my JIRA HDFS-8884, which is related to this; there I add a fast-fail check for the datanode. Of
course, I can do this part in my JIRA if you don't do it here.
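
As a rough illustration of the state check suggested above, the sketch below sums remaining space for a storage type while skipping storages that are not in the normal state. The types here are simplified, hypothetical stand-ins defined locally for the example; they are not the real {{DatanodeDescriptor}} or {{DatanodeStorage}} classes, and the real {{getRemaining}} may be structured differently.

```java
import java.util.Arrays;
import java.util.List;

public class GetRemainingSketch {

    // Simplified stand-ins for the HDFS enums, defined here only for the example.
    enum StorageType { DISK, SSD }
    enum State { NORMAL, READ_ONLY_SHARED, FAILED }

    /** Hypothetical, minimal view of one storage on a datanode. */
    static final class Storage {
        final StorageType type;
        final State state;
        final long remaining;

        Storage(StorageType type, State state, long remaining) {
            this.type = type;
            this.state = state;
            this.remaining = remaining;
        }
    }

    /** Sums remaining space of the given type, ignoring FAILED and READ_ONLY_SHARED storages. */
    static long getRemaining(List<Storage> storages, StorageType type) {
        long total = 0;
        for (Storage s : storages) {
            if (s.type == type && s.state == State.NORMAL) {
                total += s.remaining;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Storage> storages = Arrays.asList(
            new Storage(StorageType.DISK, State.NORMAL, 10),
            new Storage(StorageType.DISK, State.FAILED, 100),
            new Storage(StorageType.DISK, State.READ_ONLY_SHARED, 50),
            new Storage(StorageType.SSD, State.NORMAL, 7));

        System.out.println(getRemaining(storages, StorageType.DISK)); // 10
    }
}
```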



> The remaining space check in BlockPlacementPolicyDefault is flawed
> -----------------------------------------------------------------
>
>                 Key: HDFS-8863
>                 URL: https://issues.apache.org/jira/browse/HDFS-8863
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>              Labels: 2.6.1-candidate
>         Attachments: HDFS-8863.patch, HDFS-8863.v2.patch
>
>
> The block placement policy calls {{DatanodeDescriptor#getRemaining(StorageType)}} to
check whether the block is going to fit. Since the method is adding up all remaining spaces,
namenode can allocate a new block on a full node. This causes pipeline construction failure
and {{abandonBlock}}. If the cluster is nearly full, the client might hit this multiple times
and the write can fail permanently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
