hadoop-hdfs-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-8486) DN startup may cause severe data loss
Date Wed, 27 May 2015 16:47:17 GMT
Daryn Sharp created HDFS-8486:

             Summary: DN startup may cause severe data loss
                 Key: HDFS-8486
                 URL: https://issues.apache.org/jira/browse/HDFS-8486
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode
    Affects Versions: 2.0.0-alpha, 0.23.1
            Reporter: Daryn Sharp
            Assignee: Daryn Sharp
            Priority: Blocker

A race condition between block pool initialization and the directory scanner may cause a mass
deletion of blocks in multiple storages.

If block pool initialization finds a block on disk that is already in the replica map, it
treats the two as duplicates and deletes one of them based on size, generation stamp (GS), etc.
Unfortunately, it _always_ deletes one of the blocks, even if the two are identical, so the
replica map _must_ be empty when the pool is initialized.
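
The "always deletes" behavior can be illustrated with a minimal sketch. This is not the DataNode's actual code: the class names, the method name, and the tie-breaking order (GS first, then length) are assumptions made for illustration only. The point is that when both entries describe the same file, a victim is still chosen.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a replica; not the DataNode's real classes.
final class ReplicaSketch {
    final long blockId;
    final long genStamp; // generation stamp (GS)
    final long length;   // bytes on disk
    final String path;   // location of the block file

    ReplicaSketch(long blockId, long genStamp, long length, String path) {
        this.blockId = blockId;
        this.genStamp = genStamp;
        this.length = length;
        this.path = path;
    }
}

final class DuplicateResolutionSketch {
    // Picks the replica to delete. Assumed order: lower GS loses, then
    // shorter length loses. On a full tie a victim is still returned,
    // which mirrors the "always deletes" behavior described in the report.
    static ReplicaSketch selectReplicaToDelete(ReplicaSketch inMap, ReplicaSketch onDisk) {
        if (inMap.genStamp != onDisk.genStamp) {
            return inMap.genStamp < onDisk.genStamp ? inMap : onDisk;
        }
        if (inMap.length != onDisk.length) {
            return inMap.length < onDisk.length ? inMap : onDisk;
        }
        return inMap; // identical entries: one is deleted anyway
    }

    public static void main(String[] args) {
        Map<Long, ReplicaSketch> replicaMap = new HashMap<>();

        // Something else registered the block before initialization ran...
        ReplicaSketch registered = new ReplicaSketch(1L, 1001L, 4096L, "/data1/blk_1");
        replicaMap.put(registered.blockId, registered);

        // ...and initialization now finds the very same file on disk.
        ReplicaSketch found = new ReplicaSketch(1L, 1001L, 4096L, "/data1/blk_1");
        ReplicaSketch victim = selectReplicaToDelete(replicaMap.get(found.blockId), found);

        // The "duplicate" is the only copy, so deleting it loses the block.
        System.out.println("would delete " + victim.path);
    }
}
```

In this model the two entries point at the same path, so removing either one removes the only copy on that storage.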

The directory scanner starts at a random time within its periodic interval (default 6h).
If the scanner starts very early, it races with block pool initialization to populate the
replica map, causing the initialization to erroneously delete blocks.
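
The effect of the ordering can be shown with a small deterministic sketch. It is a simplified model under stated assumptions, not DataNode code: the replica map is reduced to a set of block IDs, `blocksDeletedByInit` is a hypothetical helper, and the race is replaced by a flag saying which side ran first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class StartupRaceSketch {
    // Returns the block IDs that initialization would delete in this model,
    // given whether the scanner populated the replica map first.
    static List<Long> blocksDeletedByInit(List<Long> blocksOnDisk, boolean scannerRanFirst) {
        Set<Long> replicaMap = new HashSet<>();
        if (scannerRanFirst) {
            // The scanner registers every block it finds on disk.
            replicaMap.addAll(blocksOnDisk);
        }
        List<Long> deleted = new ArrayList<>();
        for (Long blockId : blocksOnDisk) {
            // add() returns false when the ID is already present,
            // which initialization treats as a duplicate to resolve.
            if (!replicaMap.add(blockId)) {
                deleted.add(blockId); // one copy of a "duplicate" is always deleted
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        List<Long> disk = Arrays.asList(1L, 2L, 3L);
        System.out.println("init first:    " + blocksDeletedByInit(disk, false)); // prints []
        System.out.println("scanner first: " + blocksDeletedByInit(disk, true));  // prints [1, 2, 3]
    }
}
```

When the map starts empty nothing is deleted; when the scanner wins, every block on the storage looks like a duplicate, which matches the mass deletion described in the report.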

This message was sent by Atlassian JIRA
