hadoop-hdfs-issues mailing list archives

From "Matt Foley (JIRA)" <j...@apache.org>
Subject [jira] Created: (HDFS-1446) Refactor the start-time Directory Tree and Replicas Map constructors to share data and run volume-parallel
Date Thu, 07 Oct 2010 22:41:32 GMT
Refactor the start-time Directory Tree and Replicas Map constructors to share data and run
volume-parallel
----------------------------------------------------------------------------------------------------------

                 Key: HDFS-1446
                 URL: https://issues.apache.org/jira/browse/HDFS-1446
             Project: Hadoop HDFS
          Issue Type: Sub-task
          Components: data-node
    Affects Versions: 0.20.2
            Reporter: Matt Foley
            Assignee: Matt Foley
             Fix For: 0.22.0


Refactor the FSDir() and getVolumeMap() call chains in FSDataset so that they share data and run
volume-parallel. Currently, the two constructors for the in-memory directory tree and the replicas
map together run THREE full scans of the entire disk: once in FSDir(), once in recoverTempUnlinkedBlock(),
and once in addToReplicasMap(). During each scan, a new File object is created for each of
the 100,000 or so items in the native file system (for a 50,000-block node). This increases
GC pressure as well as disk traffic.
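
The idea above can be illustrated with a minimal sketch: each volume is walked exactly once, on its own thread, and every consumer reads from the one shared result instead of rescanning. This is not the FSDataset code or the eventual patch. The class VolumeScanSketch and the methods scanVolume and scanAll are hypothetical names, and the sketch assumes that block data files are named with a "blk_" prefix and that their metadata files end in ".meta".

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not the actual FSDataset implementation.
final class VolumeScanSketch {

    // Walks one directory tree a single time and records each block data file.
    // Assumption: data files start with "blk_" and metadata files end with ".meta".
    static void scanVolume(File dir, Map<String, File> replicas) {
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or an I/O error while listing
        }
        for (File child : children) {
            String name = child.getName();
            if (child.isDirectory()) {
                scanVolume(child, replicas);
            } else if (name.startsWith("blk_") && !name.endsWith(".meta")) {
                replicas.put(name, child);
            }
        }
    }

    // Scans all volumes in parallel, one task per volume, into one shared map.
    static Map<String, File> scanAll(List<File> volumes) throws Exception {
        Map<String, File> replicas = new ConcurrentHashMap<>();
        ExecutorService pool =
            Executors.newFixedThreadPool(Math.max(1, volumes.size()));
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (File volume : volumes) {
                pending.add(pool.submit(() -> scanVolume(volume, replicas)));
            }
            for (Future<?> task : pending) {
                task.get(); // wait for the scan; rethrows any failure from it
            }
        } finally {
            pool.shutdown();
        }
        return replicas;
    }
}
```

In this sketch the directory tree and the replicas map would both be built from the single listing that scanAll returns, so the File objects are created once per item rather than once per scan.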

This work item is one of four sub-tasks for HDFS-1443, Improve Datanode startup time.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

