hadoop-common-dev mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-79) listFiles optimization
Date Mon, 13 Mar 2006 23:54:03 GMT
listFiles optimization
----------------------

         Key: HADOOP-79
         URL: http://issues.apache.org/jira/browse/HADOOP-79
     Project: Hadoop
        Type: Improvement
  Components: dfs  
    Reporter: Konstantin Shvachko


In FSDirectory.getListing(), looking at the line

    listing[i] = new DFSFileInfo(curName, cur.computeFileLength(),
                                 cur.computeContentsLength(), isDir(curName));

1. computeContentsLength() actually calls computeFileLength(), so the file length is
computed twice.
2. isDir() searches for the INode (starting from rootDir) that was already obtained just
two lines above; note that the tree is already locked at that point.

I propose a simple optimization for this; see the attachment.
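
Roughly, the loop body could look like the sketch below (illustrative only, not the
attached patch; it assumes the already-located INode cur can answer isDir() itself, so
neither the length computation nor the tree lookup from rootDir is repeated):

    long fileLen = cur.computeFileLength();        // compute the file length exactly once
    long contentsLen = cur.isDir() ? cur.computeContentsLength() : fileLen;  // reuse it for plain files
    listing[i] = new DFSFileInfo(curName, fileLen, contentsLen, cur.isDir());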

3. A related question: why does DFSFileInfo need two separate fields, len for the file
length and contentsLen for the directory contents size? These fields look mutually
exclusive, so we could use just one and interpret it one way or the other depending on
the value of isDir.
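
If they were merged, DFSFileInfo could look roughly like this (a hypothetical sketch;
field and accessor names are illustrative, not the actual class):

    public class DFSFileInfo {
        private UTF8 path;
        private long len;        // file length if !isDir, directory contents size if isDir
        private boolean isDir;

        public long getLen()         { return isDir ? 0 : len; }
        public long getContentsLen() { return isDir ? len : 0; }
        public boolean isDir()       { return isDir; }
    }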

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

