hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HDFS-1260) 0.20: Block lost when multiple DNs trying to recover it to different genstamps
Date Wed, 25 Aug 2010 20:56:17 GMT


Todd Lipcon updated HDFS-1260:

    Attachment: simultaneous-recoveries.txt

After months of running this test, I ran into the failure attached above. One of the DNs somehow
ends up with multiple meta files for the same block, but at different generation stamps.

I think the issue is in the implementation of DataNode.updateBlock(). The block passed in
doesn't have a wildcard generation stamp, but we don't care - we go and find the block on
disk without matching generation stamps. I think this is OK given the validation logic
- we still only move blocks forward in GS-time, and never revert length. However, when we
then call updateBlockMap(), it doesn't use a wildcard generation stamp either, so the block can
get left in the block map under the old generation stamp. I think this inconsistency cascades
into the sort of failure seen in the attached log.

I *think* the solution is:
  - Change updateBlock to call updateBlockMap with a wildcard generation stamp key
  - Change the interruption code to use a wildcard GS block when interrupting concurrent writers

I will make these changes and see if the rest of the unit tests still pass, then see if I
can come up with a regression test.
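To make the map inconsistency above concrete, here is a toy sketch - not the real FSDataset code, all names and the wildcard sentinel are illustrative - of a block map keyed by (block ID, genstamp). An update keyed on the caller's exact old genstamp misses an entry whose genstamp has already moved on, leaving two entries for one block ID; a wildcard-keyed update (the proposed fix) removes the stale entry:

```python
WILDCARD_GS = -1  # hypothetical sentinel meaning "match any genstamp"

class VolumeMap:
    """Toy stand-in for the DN block map, keyed by (block_id, gen_stamp)."""
    def __init__(self):
        self.entries = {}  # (block_id, gen_stamp) -> length, etc.

    def put(self, block_id, gen_stamp):
        self.entries[(block_id, gen_stamp)] = gen_stamp

    def update(self, block_id, old_gs, new_gs, key_gs):
        # Remove the old entry matched by key_gs, then insert the new one.
        if key_gs == WILDCARD_GS:
            stale = [k for k in self.entries if k[0] == block_id]
        else:
            stale = [k for k in self.entries if k == (block_id, key_gs)]
        for k in stale:
            del self.entries[k]
        self.put(block_id, new_gs)

# Recovery wants to move blk_1 from GS 7091 to 7094, but a concurrent
# recovery already advanced the map entry to 7093.

# Current behaviour: exact-GS key finds nothing, stale 7093 entry survives.
vm_exact = VolumeMap()
vm_exact.put(1, 7093)
vm_exact.update(1, old_gs=7091, new_gs=7094, key_gs=7091)
print(sorted(k for k in vm_exact.entries))  # two entries for block 1

# Proposed fix: wildcard key matches on block ID alone, stale entry removed.
vm_wild = VolumeMap()
vm_wild.put(1, 7093)
vm_wild.update(1, old_gs=7091, new_gs=7094, key_gs=WILDCARD_GS)
print(sorted(k for k in vm_wild.entries))  # one entry for block 1
```

Under the exact-key update the map ends up holding both the 7093 and 7094 entries for the same block ID, which is the "multiple meta files / stale map" state described above; the wildcard key collapses it to the single 7094 entry.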

> 0.20: Block lost when multiple DNs trying to recover it to different genstamps
> ------------------------------------------------------------------------------
>                 Key: HDFS-1260
>                 URL: https://issues.apache.org/jira/browse/HDFS-1260
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.20-append
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.20-append
>         Attachments: hdfs-1260.txt, hdfs-1260.txt, simultaneous-recoveries.txt
> Saw this issue on a cluster where some ops people were doing network changes without
> shutting down DNs first. So, recovery ended up getting started at multiple different DNs
> at the same time, and some race condition occurred that caused a block to get permanently
> stuck in recovery mode. What seems to have happened is the following:
> - FSDataset.tryUpdateBlock called with old genstamp 7091, new genstamp 7094, while the
> block in the volumeMap (and on the filesystem) was at genstamp 7093
> - we find the block file and meta file based on block ID only, without comparing genstamps
> - we rename the meta file to the new genstamp _7094
> - in updateBlockMap, we do the comparison in the volumeMap by oldblock *without* a wildcard
> GS, so it does *not* update the volumeMap
> - validateBlockMetadata now fails with "blk_7739687463244048122_7094 does not exist in
> blocks map"
> After this point, all future recovery attempts on that node fail in getBlockMetaDataInfo,
> since it finds the _7094 genstamp in getStoredBlock (because the meta file got renamed above)
> and then fails since _7094 isn't in the volumeMap in validateBlockMetadata.
> Making a unit test for this is probably going to be difficult, but doable.
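The quoted sequence can be walked through with a toy model - not the real HDFS code; the function names mirror the report but the bodies are illustrative. The meta filename on disk advances to _7094 while the in-memory map still says 7093, so every later recovery attempt fails validation:

```python
bid = "blk_7739687463244048122"

meta_files = {bid + "_7093.meta"}  # on-disk state: meta file at GS 7093
volume_map = {(bid, 7093)}         # in-memory state: map entry at GS 7093

def try_update_block(block_id, old_gs, new_gs):
    # Find the meta file by block ID only, without comparing genstamps ...
    old = next(f for f in meta_files if f.startswith(block_id + "_"))
    # ... rename it to the new genstamp ...
    meta_files.remove(old)
    meta_files.add("%s_%d.meta" % (block_id, new_gs))
    # ... but only update the map if the *exact* old genstamp matches.
    if (block_id, old_gs) in volume_map:
        volume_map.remove((block_id, old_gs))
        volume_map.add((block_id, new_gs))

def get_stored_gs(block_id):
    # Genstamp as recovered from the on-disk meta filename.
    f = next(f for f in meta_files if f.startswith(block_id + "_"))
    return int(f.rsplit("_", 1)[1].split(".")[0])

def validate_block_metadata(block_id, gs):
    if (block_id, gs) not in volume_map:
        raise RuntimeError("%s_%d does not exist in blocks map" % (block_id, gs))

# Recovery arrives with old GS 7091, new GS 7094; the map holds 7093,
# so the meta file is renamed but the map is never updated.
try_update_block(bid, 7091, 7094)
gs = get_stored_gs(bid)  # 7094, read back from the renamed meta file

try:
    validate_block_metadata(bid, gs)
except RuntimeError as e:
    print(e)  # blk_7739687463244048122_7094 does not exist in blocks map
```

Every subsequent recovery now repeats the same getStoredBlock/validate pair against the same inconsistent state, which is why the block stays stuck until the map and the filename agree again.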

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
