hadoop-common-dev mailing list archives

From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-713) dfs list operation is too expensive
Date Mon, 13 Nov 2006 23:25:38 GMT
dfs list operation is too expensive

                 Key: HADOOP-713
                 URL: http://issues.apache.org/jira/browse/HADOOP-713
             Project: Hadoop
          Issue Type: Improvement
          Components: dfs
    Affects Versions: 0.8.0
            Reporter: Hairong Kuang

A list request to dfs returns an array of DFSFileInfo. The DFSFileInfo of a directory contains
a field called contentsLen indicating its size, which is computed on the namenode side
by recursively going through its subdirs. At the same time, the whole dfs directory tree is

The list operation is used a lot by DFSClient: for listing a directory, for getting a file's size
and number of replicas, and for getting the size of dfs. Only the last operation needs the field
contentsLen to be computed.

To reduce its cost, we can add a flag to the list request. contentsLen is computed only if the
flag is set. By default, the flag is false.
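
The proposal could be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual namenode code: Node, FileInfo, withContentsLen, NOT_COMPUTED, sumLengths and the visited counter are all hypothetical names introduced here, and treating a file's contentsLen as its own length is an assumption. It only shows that the recursive walk is skipped unless the caller asks for it.

```java
import java.util.ArrayList;
import java.util.List;

class ListFlagSketch {
    // Hypothetical stand-in for a namespace entry; not the real namenode class.
    static class Node {
        final String name;
        final boolean isDir;
        final long length; // file length in bytes; 0 for directories
        final List<Node> children = new ArrayList<>();

        Node(String name, boolean isDir, long length) {
            this.name = name;
            this.isDir = isDir;
            this.length = length;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Hypothetical stand-in for DFSFileInfo.
    static class FileInfo {
        final String name;
        final boolean isDir;
        final long contentsLen; // NOT_COMPUTED for a directory when the flag is off

        FileInfo(String name, boolean isDir, long contentsLen) {
            this.name = name;
            this.isDir = isDir;
            this.contentsLen = contentsLen;
        }
    }

    static final long NOT_COMPUTED = -1;

    // Counts nodes touched by the recursive walk, to make the cost visible.
    static long visited = 0;

    // Recursively sums file lengths under n (the expensive part).
    static long sumLengths(Node n) {
        visited++;
        if (!n.isDir) {
            return n.length;
        }
        long total = 0;
        for (Node child : n.children) {
            total += sumLengths(child);
        }
        return total;
    }

    // Lists the direct children of dir; recurses into subdirs only if the flag is set.
    static List<FileInfo> list(Node dir, boolean withContentsLen) {
        List<FileInfo> out = new ArrayList<>();
        for (Node child : dir.children) {
            long contentsLen;
            if (!child.isDir) {
                contentsLen = child.length; // assumption: a file's contentsLen is its length
            } else if (withContentsLen) {
                contentsLen = sumLengths(child);
            } else {
                contentsLen = NOT_COMPUTED;
            }
            out.add(new FileInfo(child.name, child.isDir, contentsLen));
        }
        return out;
    }

    // Small example tree: /a/x (10 bytes), /a/b/y (5 bytes), /f (3 bytes).
    static Node sampleTree() {
        Node b = new Node("b", true, 0).add(new Node("y", false, 5));
        Node a = new Node("a", true, 0).add(new Node("x", false, 10)).add(b);
        return new Node("/", true, 0).add(a).add(new Node("f", false, 3));
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        for (boolean flag : new boolean[] {false, true}) {
            visited = 0;
            for (FileInfo info : list(root, flag)) {
                System.out.println(flag + " " + info.name + " contentsLen=" + info.contentsLen);
            }
            System.out.println("nodes visited: " + visited);
        }
    }
}
```

With the flag off, the listing touches only the direct children of the directory; the subtree walk happens only for callers that need the size of dfs.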

This message is automatically generated by JIRA.
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

