hadoop-common-dev mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (HADOOP-4540) An invalidated block should be removed from the blockMap
Date Fri, 31 Oct 2008 18:41:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644391#action_12644391
] 

rangadi edited comment on HADOOP-4540 at 10/31/08 11:40 AM:
-----------------------------------------------------------------

(Edit: corrected the jira number referred.)

I think this was the policy even in the pre-0.17.0 NameNode, i.e. blocks were deleted from
blocksMap only lazily. Whether HADOOP-4556 has always been there, or was made more probable
by another policy change, I am not sure.

bq. My proposal is to remove a replica from the blocks map when it is marked as "invalid"
(i.e., when it is moved to the recentInvalidateSet) as a result of over-replication. Also
when a block report comes in, and a new replica is found but it is marked as invalid, this
new replica does not get added to the blocks map.

This probably needs more details.
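
The way I read it, the two changes together would look roughly like this (a toy model with
made-up names and structures, not the actual FSNamesystem code):

{code:java}
import java.util.*;

// Toy model of the proposal only -- illustrative names, not the real FSNamesystem code.
class ProposedPolicySketch {
  // block -> datanodes believed to hold it (stands in for blocksMap)
  private final Map<String, Set<String>> blocksMap = new HashMap<String, Set<String>>();
  // datanode -> blocks scheduled for deletion (stands in for recentInvalidateSets)
  private final Map<String, Set<String>> recentInvalidateSets = new HashMap<String, Set<String>>();

  // Over-replication: today only the scheduling step happens and the location stays in
  // blocksMap until a later block report omits it; the proposal adds the eager removal.
  void invalidateExcessReplica(String block, String node) {
    scheduleInvalidation(node, block);
    Set<String> locations = blocksMap.get(block);
    if (locations != null) {
      locations.remove(node);              // proposed: drop the location immediately
    }
  }

  // Block report: the proposal is to skip replicas already scheduled for invalidation
  // instead of re-adding them to blocksMap.
  void processReportedReplica(String block, String node) {
    Set<String> pending = recentInvalidateSets.get(node);
    if (pending != null && pending.contains(block)) {
      return;                              // proposed: do not re-add an invalidated replica
    }
    Set<String> locations = blocksMap.get(block);
    if (locations == null) {
      locations = new HashSet<String>();
      blocksMap.put(block, locations);
    }
    locations.add(node);
  }

  private void scheduleInvalidation(String node, String block) {
    Set<String> pending = recentInvalidateSets.get(node);
    if (pending == null) {
      pending = new HashSet<String>();
      recentInvalidateSets.put(node, pending);
    }
    pending.add(block);
  }
}
{code}

Even in this toy form, the block-report path and the over-replication path have to agree on
what "scheduled for invalidation" means, which is exactly the kind of relation I would like
to see written down.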

We have so many maps: blocksMap, neededReplications, excessReplications, etc. These are all
supposed to be consistent in some way, but what the consistency requirements are, and how
they are enforced, is not explicitly defined anywhere. I am afraid that if we make one
isolated change now, it is very hard to say for sure that we are not introducing issues
similar to HADOOP-4556.


We could probably do something smaller to avoid HADOOP-4556. But to change a policy that has
been there since the beginning, as this jira proposes, I think we need to consider more. I
propose we write down the maps involved and their relations (when and why a block moves to
and from these maps, etc.).
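
For instance, once the relations are written down, a property like "a replica scheduled for
invalidation is not counted as a valid location" could be checked directly in a test. A
sketch, again with simple stand-in structures rather than the real FSNamesystem fields:

{code:java}
import java.util.*;

// Sketch only: stand-in structures, not the real FSNamesystem fields.
class MapConsistencyCheck {

  // Returns true iff no replica scheduled for deletion is still listed as a valid
  // location in blocksMap -- the property this jira is about.
  static boolean noInvalidatedReplicaInBlocksMap(
      Map<String, Set<String>> blocksMap,              // block -> locations
      Map<String, Set<String>> recentInvalidateSets) { // datanode -> blocks to delete
    for (Map.Entry<String, Set<String>> entry : recentInvalidateSets.entrySet()) {
      String node = entry.getKey();
      for (String block : entry.getValue()) {
        Set<String> locations = blocksMap.get(block);
        if (locations != null && locations.contains(node)) {
          return false;   // stale location still visible to getBlockLocations
        }
      }
    }
    return true;
  }
}
{code}

Similar properties would be needed for neededReplications and excessReplications before we
can be confident that an isolated change does not break them.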



> An invalidated block should be removed from the blockMap
> --------------------------------------------------------
>
>                 Key: HADOOP-4540
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4540
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.17.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>            Priority: Blocker
>             Fix For: 0.18.3
>
>
> Currently, when a namenode schedules an over-replicated block for deletion, the replica to
be deleted does not get removed from the block map immediately. Instead, it gets removed when
the next block report comes in. This causes three problems:
> 1. getBlockLocations may return locations that do not contain the block;
> 2. Over-replication due to unsuccessful deletion cannot be detected, as described in
HADOOP-4477.
> 3. The number of blocks shown on the dfs Web UI does not get updated on a source node when
a large number of blocks have been moved from the source node to a target node, for example,
when running a balancer.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

