hadoop-hdfs-dev mailing list archives

From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Created: (HDFS-1658) A less expensive way to figure out directory size
Date Fri, 25 Feb 2011 22:59:21 GMT
A less expensive way to figure out directory size

                 Key: HDFS-1658
                 URL: https://issues.apache.org/jira/browse/HDFS-1658
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Hairong Kuang
            Assignee: Hairong Kuang

Currently, in order to figure out a directory's size, we have to list the directory by calling
RPC getListing and sum up the sizes of its children. This is an expensive operation if a directory is
large.

On the other hand, when fetching the status of a path (i.e. calling RPC getFileInfo), the length
field of FileStatus is set to 0 if the path is a directory.

I am thinking of changing this field (FileStatus#length) to hold the directory size when the path
is a directory, so that we can call getFileInfo to get the directory size. This call is much less
expensive and simpler than getListing.
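
To make the difference concrete, here is a minimal sketch on a toy in-memory
model. None of these types are HDFS classes: Status, Directory, addFile,
getFileInfoCurrent and getFileInfoProposed are hypothetical names made up for
illustration. The sketch also assumes that "directory size" means the sum of
the lengths of the direct children and that the directory keeps a running
total; the issue does not specify either point.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; these are NOT the HDFS FileStatus or NameNode classes. */
final class DirSizeSketch {

    /** Hypothetical stand-in for a status entry returned by an RPC. */
    static final class Status {
        final String name;
        final boolean isDir;
        final long length;

        Status(String name, boolean isDir, long length) {
            this.name = name;
            this.isDir = isDir;
            this.length = length;
        }
    }

    /** Hypothetical directory that holds files only. */
    static final class Directory {
        private final List<Status> children = new ArrayList<>();
        // Assumption: a running total updated on every add.
        private long totalLength = 0L;

        void addFile(String name, long length) {
            children.add(new Status(name, false, length));
            totalLength += length;
        }

        /** Analogue of getListing: one entry per child is returned. */
        List<Status> getListing() {
            return new ArrayList<>(children);
        }

        /** Analogue of getFileInfo today: length is 0 for a directory. */
        Status getFileInfoCurrent() {
            return new Status(".", true, 0L);
        }

        /** Analogue of the proposal: length carries the directory size. */
        Status getFileInfoProposed() {
            return new Status(".", true, totalLength);
        }
    }

    /** Current approach: fetch every child entry and add up the lengths. */
    static long sizeByListing(Directory dir) {
        long total = 0L;
        for (Status child : dir.getListing()) {
            total += child.length;
        }
        return total;
    }

    /** Proposed approach: a single status lookup, no listing needed. */
    static long sizeByFileInfo(Directory dir) {
        return dir.getFileInfoProposed().length;
    }

    public static void main(String[] args) {
        Directory dir = new Directory();
        dir.addFile("part-00000", 100L);
        dir.addFile("part-00001", 250L);
        System.out.println("by listing:   " + sizeByListing(dir));
        System.out.println("by file info: " + sizeByFileInfo(dir));
    }
}
```

In this model the listing path transfers and walks one entry per child, while
the status path returns a single entry regardless of how many children exist,
which is the cost difference the proposal is after.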

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

