hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-14531) Datanode's ScanInfo requires excessive memory
Date Tue, 11 Jun 2019 16:10:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-14531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16861176#comment-16861176 ]

Todd Lipcon commented on HDFS-14531:

It seems the DirectoryScanner serves two purposes: (1) detect blocks in memory that are missing
from disk, and (2) detect blocks on disk that are missing from memory. In most cases, the
first of these two purposes is much more important, since it protects against file system
bugs/corruptions or naughty administrators accidentally removing files underneath the datanode.
Accidental addition of block files is far less likely, and after a purposeful addition (eg during
some manual restore procedure) the administrator can always restart the DN to pick them up.

Given that, I think we could keep the functionality but reduce the memory usage by a few orders
of magnitude using bloom filters:
- when scanning the disk, instead of building a ScanInfo for each block, populate a
bloom filter. For the example here of 1M replicas, you can get a bloom filter with a 0.1% FP
rate with only 1.7MB of RAM.
- when reconciling, check each in-memory block to see if it's present in the bloom filter.
If the bloom filter says "not present", you know you have a missing block. If it says "present",
you have a 0.1% chance of not detecting a missing block.
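
A minimal Java sketch of the two steps above, to make the sizing and the reconcile check concrete. This is not DirectoryScanner code: BlockBloomFilter, BloomReconcileSketch and findMissing are hypothetical names, block IDs are assumed to be plain longs, and the hash (a SplitMix64-style mixer with double hashing) is just one reasonable choice. The sizing uses the standard bloom filter formulas m = -n*ln(p)/(ln 2)^2 bits and k = (m/n)*ln 2 hashes.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical reconcile step: report in-memory blocks the disk filter rules out. */
final class BloomReconcileSketch {
    static List<Long> findMissing(long[] inMemoryBlockIds, BlockBloomFilter onDisk) {
        List<Long> missing = new ArrayList<>();
        for (long id : inMemoryBlockIds) {
            // "not present" is definitive; "present" may be a false positive.
            if (!onDisk.mightContain(id)) missing.add(id);
        }
        return missing;
    }

    public static void main(String[] args) {
        int bits = BlockBloomFilter.optimalBits(1_000_000, 0.001);
        System.out.printf("1M replicas at 0.1%% FP: %d bits (%.2f MiB)%n",
                bits, bits / 8.0 / (1 << 20));
    }
}

/** Hypothetical seeded bloom filter over block IDs; not part of Hadoop. */
final class BlockBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;
    private final long seed;

    BlockBloomFilter(long expectedBlocks, double fpRate, long seed) {
        this.numBits = optimalBits(expectedBlocks, fpRate);
        this.numHashes = optimalHashes(expectedBlocks, numBits);
        this.bits = new BitSet(numBits);
        this.seed = mix(seed);
    }

    /** Standard sizing: m = -n * ln(p) / (ln 2)^2 bits (assumes the result fits in an int). */
    static int optimalBits(long n, double p) {
        return (int) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    /** Standard hash count: k = (m / n) * ln 2, at least 1. */
    static int optimalHashes(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    /** SplitMix64-style finalizer, used here as a cheap 64-bit mixer. */
    private static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    /** i-th bit index via double hashing: (h1 + i * h2) mod m. */
    private int index(long blockId, int i) {
        long h1 = mix(blockId ^ seed);
        long h2 = mix(h1) | 1L;
        return (int) Long.remainderUnsigned(h1 + i * h2, numBits);
    }

    /** Called once per block file found during the disk scan. */
    void add(long blockId) {
        for (int i = 0; i < numHashes; i++) bits.set(index(blockId, i));
    }

    /** False means definitely never added; true means probably added. */
    boolean mightContain(long blockId) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(blockId, i))) return false;
        }
        return true;
    }
}
```

For 1M replicas at a 0.1% FP rate the formula gives roughly 14.4 million bits, which is where the ~1.7MB figure comes from. Note the filter only covers purpose (1): it cannot enumerate blocks that are on disk but missing from memory.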

In order to guarantee that we eventually detect missing blocks, each pass of this algorithm
can use a different hash seed for the bloom filter. Since the passes are then independent,
the chance that a missing block goes undetected through N passes is FP^N (eg after two
passes at a 0.1% FP rate, it is 0.0001%). So, within
a few passes, the probability of undetected corruption will shrink to a smaller value than
many other undetectable errors on HDFS (eg after four passes at FP=10^-3, the probability
of missing a block would be less than the probability of an undetected 32-bit checksum error).
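
The arithmetic behind that comparison can be checked with a few lines of Java. This is only a sketch: the class and method names are made up for illustration, it assumes false positives are independent across passes (which is what the per-pass seed is for), and it takes 2^-32 as the chance that a random corruption slips past a 32-bit checksum.

```java
/** Hypothetical helper: how fast does FP^N shrink across scan passes? */
final class MultiPassMissProbability {
    /** Chance a missing block survives every one of `passes` independent passes. */
    static double missProbability(double fpRate, int passes) {
        return Math.pow(fpRate, passes);
    }

    /** Smallest number of passes for which FP^N drops below `target`. */
    static int passesToBeat(double fpRate, double target) {
        if (fpRate <= 0.0 || fpRate >= 1.0 || target <= 0.0) {
            throw new IllegalArgumentException("need 0 < fpRate < 1 and target > 0");
        }
        int passes = 0;
        double p = 1.0;
        while (p >= target) {
            p *= fpRate;
            passes++;
        }
        return passes;
    }

    public static void main(String[] args) {
        double checksumMiss = Math.pow(2, -32); // about 2.3e-10
        for (int n = 1; n <= 4; n++) {
            System.out.printf("after %d pass(es): %.1e%n", n, missProbability(1e-3, n));
        }
        System.out.printf("32-bit checksum miss: %.1e%n", checksumMiss);
        System.out.println("passes needed: " + passesToBeat(1e-3, checksumMiss));
    }
}
```

Three passes give 10^-9, which is still above 2^-32 (about 2.3 x 10^-10); the fourth pass brings it down to 10^-12.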

Of course this doesn't solve the IO issue that Nathan mentioned, but that could be addressed
by throttling.

> Datanode's ScanInfo requires excessive memory
> ---------------------------------------------
>                 Key: HDFS-14531
>                 URL: https://issues.apache.org/jira/browse/HDFS-14531
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Daryn Sharp
>            Priority: Major
>         Attachments: Screen Shot 2019-05-31 at 12.25.54 PM.png
> The DirectoryScanner's ScanInfo map consumes ~4.5X as much memory as the replica
> map.  For 1.1M replicas: the replica map is ~91M while the scan info is ~405M.

This message was sent by Atlassian JIRA
