hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7960) The full block report should prune zombie storages even if they're not empty
Date Fri, 20 Mar 2015 23:11:39 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372295#comment-14372295 ]

Andrew Wang commented on HDFS-7960:

Reading through it again, a few comments:

* there's a TODO: FIXME where we aren't passing in the BlockReportContext. I think processReport
doesn't need that last parameter anymore either, since the information is in the BR context.

* Is there a need for BR ids to be monotonically increasing? Otherwise, using a random number
seems better. I see you do a fixup by checking against the previous ID, but with a random ID
this shouldn't be necessary.

* It looks like we only get/set LastBlockReportId in removeZombieStorages. We need to be setting
it to the current BR id as BRs come in, right? This is probably a holdover from processReport
not being updated since the previous patch rev.

If you wanted to add comments about all this, BlockReportContext's class javadoc would be
a good choice.
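To make the random-ID point concrete, here is a minimal sketch of what I have in mind (the class and method names below are hypothetical, not the patch's actual code): pick a random BR id and simply re-roll on the rare collision with the previous id, so no monotonicity fixup is needed.

```java
import java.util.concurrent.ThreadLocalRandom;

public class BlockReportIdSketch {
  // Hypothetical helper: choose a random block report ID that differs
  // from the previous one. Re-rolling on collision replaces the
  // "check against the previous ID" fixup that monotonic IDs require.
  static long nextBlockReportId(long prevId) {
    long id;
    do {
      id = ThreadLocalRandom.current().nextLong();
      // Assumption: 0 is reserved to mean "no report received yet",
      // so skip it as well.
    } while (id == prevId || id == 0);
    return id;
  }

  public static void main(String[] args) {
    long prev = 0;
    long id = nextBlockReportId(prev);
    System.out.println(id != prev && id != 0);
  }
}
```

With 64 random bits the re-roll loop almost never iterates more than once, and the IDs carry no ordering assumption that a restart or reordered RPC could violate.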


    assert (namesystem.hasWriteLock());

space after assert

Going to stop there for now; I think we need to see another rev (addressing the processReport
FIXME, basically) to get a feel for BlockReportContext.

> The full block report should prune zombie storages even if they're not empty
> ----------------------------------------------------------------------------
>                 Key: HDFS-7960
>                 URL: https://issues.apache.org/jira/browse/HDFS-7960
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Lei (Eddy) Xu
>            Assignee: Colin Patrick McCabe
>            Priority: Critical
>         Attachments: HDFS-7960.002.patch, HDFS-7960.003.patch, HDFS-7960.004.patch
> The full block report should prune zombie storages even if they're not empty.  We have
> seen cases in production where zombie storages have not been pruned subsequent to HDFS-7575.
> This could arise any time the NameNode thinks there is a block in some old storage which
> is actually not there.  In this case, the block will not show up in the "new" storage (once
> old is renamed to new) and the old storage will linger forever as a zombie, even with the
> HDFS-7596 fix applied.  This also happens with datanode hotplug, when a drive is removed.
> In this case, an entire storage (volume) goes away but the blocks do not show up in another
> storage on the same datanode.

This message was sent by Atlassian JIRA
