hadoop-hdfs-issues mailing list archives

From "Virajith Jalaparti (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-12777) [READ] Reduce memory and CPU footprint for PROVIDED volumes.
Date Thu, 09 Nov 2017 20:20:00 GMT

     [ https://issues.apache.org/jira/browse/HDFS-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Virajith Jalaparti updated HDFS-12777:
    Status: Patch Available  (was: Open)

> [READ] Reduce memory and CPU footprint for PROVIDED volumes.
> ------------------------------------------------------------
>                 Key: HDFS-12777
>                 URL: https://issues.apache.org/jira/browse/HDFS-12777
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Virajith Jalaparti
>            Assignee: Virajith Jalaparti
>         Attachments: HDFS-12777-HDFS-9806.001.patch, HDFS-12777-HDFS-9806.002.patch,
> Unlike local blocks, all blocks in PROVIDED storage are tracked by every DN. For 100s of
TBs of PROVIDED data, this can mean millions of blocks, and storing the metadata for these
blocks can lead to a large memory footprint. Further, with so many blocks, running the
{{DirectoryScanner}} on a PROVIDED volume can increase memory and CPU utilization.
> To reduce these overheads, this JIRA aims to (a) disable the {{DirectoryScanner}} on
PROVIDED volumes (as HDFS-9806 focuses only on read-only data in PROVIDED volumes), and
(b) reduce the space occupied by {{FinalizedProvidedReplicaInfo}} by using a common URI
prefix across all PROVIDED blocks.
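
The two ideas above can be illustrated with a minimal, self-contained sketch. This is not the code in the attached patches. The class {{ProvidedReplicaSketch}}, the nested {{CompactReplica}}, the local {{StorageType}} enum, and the {{shouldScan}} helper are all hypothetical names invented for illustration. The sketch only assumes that every PROVIDED block URI starts with one common prefix, so each replica can hold a reference to a single shared prefix string plus its own short suffix, and that a scanner can skip a volume based on its storage type.

```java
import java.net.URI;

/** Illustrative sketch only; all names here are hypothetical, not Hadoop APIs. */
final class ProvidedReplicaSketch {

    /** Local stand-in for a storage type, so the sketch has no Hadoop dependency. */
    enum StorageType { DISK, PROVIDED }

    /** (a) A directory scan is skipped for read-only PROVIDED volumes. */
    static boolean shouldScan(StorageType type) {
        return type != StorageType.PROVIDED;
    }

    /** (b) A replica that stores only the part of its URI that differs per block. */
    static final class CompactReplica {
        private final String sharedPrefix; // same String object for every replica
        private final String suffix;       // per-replica remainder of the URI
        private final long offset;
        private final long length;

        CompactReplica(String sharedPrefix, URI blockUri, long offset, long length) {
            String full = blockUri.toString();
            if (!full.startsWith(sharedPrefix)) {
                throw new IllegalArgumentException(
                    "URI " + full + " does not start with prefix " + sharedPrefix);
            }
            this.sharedPrefix = sharedPrefix;
            this.suffix = full.substring(sharedPrefix.length());
            this.offset = offset;
            this.length = length;
        }

        /** Rebuilds the full URI on demand instead of keeping it in memory. */
        URI getBlockURI() {
            return URI.create(sharedPrefix + suffix);
        }

        String getSuffix() { return suffix; }
        long getOffset()   { return offset; }
        long getLength()   { return length; }
    }
}
```

With a made-up prefix such as {{s3a://bucket/data/}}, a replica built from {{s3a://bucket/data/part-0001}} keeps only {{part-0001}} as per-replica state, and {{getBlockURI()}} reconstructs the full URI when it is needed. The trade-off in this sketch is a small string concatenation on each lookup in exchange for not holding a full URI per block.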
