hadoop-hdfs-dev mailing list archives

From "Kihwal Lee (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-6889) Provide an iterator-based listing API for FileSystem
Date Wed, 20 Aug 2014 18:02:28 GMT
Kihwal Lee created HDFS-6889:

             Summary: Provide an iterator-based listing API for FileSystem
                 Key: HDFS-6889
                 URL: https://issues.apache.org/jira/browse/HDFS-6889
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Kihwal Lee

Iterator-based listing methods already exist in {{FileContext}} for both simple listing and
listing with locations. However, {{FileSystem}} has only the latter; it lacks an iterator-based
simple listing. From what I understand, it wasn't added to {{FileSystem}} because {{FileSystem}}
was expected to be phased out soon. Since {{FileSystem}} is very much alive today and new
features are added to it frequently, I propose adding an iterator-based {{listStatus}} method.
For the name of the new method, we can use the same name as in {{FileContext}}: {{listStatusIterator()}}.

It will be particularly useful when listing giant directories. Without it, the client has
to build the entire listing as one huge data structure and hold it in memory. We've seen
client JVMs run out of memory because of this.
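
To illustrate the memory argument, here is a minimal, self-contained sketch of a paged listing iterator. It does not use Hadoop classes: {{RemoteIter}} is a stand-in modeled on Hadoop's {{RemoteIterator}} (whose {{hasNext()}} and {{next()}} may throw {{IOException}}), while {{FakeNameNode}}, {{PagedListing}}, {{countEntries}}, and the batch size are hypothetical names and values invented for this example. The point is only that the client holds one page of entries at a time rather than the whole directory; the actual implementation may well differ.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

public class ListingSketch {

    /** Stand-in for an iterator whose calls may fail with an I/O error. */
    interface RemoteIter<E> {
        boolean hasNext() throws IOException;
        E next() throws IOException;
    }

    /** Hypothetical server that holds a directory of `total` entries and serves it in pages. */
    static class FakeNameNode {
        private final int total;
        int requests = 0; // number of page requests served so far

        FakeNameNode(int total) { this.total = total; }

        /** Returns up to `limit` entry names, starting at index `start`. */
        List<String> getListing(int start, int limit) {
            requests++;
            List<String> page = new ArrayList<>();
            for (int i = start; i < Math.min(total, start + limit); i++) {
                page.add("file-" + i);
            }
            return page;
        }
    }

    /** Iterates over the directory while keeping at most one page in memory. */
    static class PagedListing implements RemoteIter<String> {
        private final FakeNameNode server;
        private final int batchSize;
        private List<String> page = new ArrayList<>();
        private int posInPage = 0;
        private int nextStart = 0;
        private boolean exhausted = false;

        PagedListing(FakeNameNode server, int batchSize) {
            this.server = server;
            this.batchSize = batchSize;
        }

        @Override
        public boolean hasNext() throws IOException {
            if (posInPage < page.size()) return true;
            if (exhausted) return false;
            // Current page is used up: fetch the next one, replacing the old page.
            page = server.getListing(nextStart, batchSize);
            posInPage = 0;
            nextStart += page.size();
            if (page.size() < batchSize) exhausted = true; // a short page is the last page
            return !page.isEmpty();
        }

        @Override
        public String next() throws IOException {
            if (!hasNext()) throw new NoSuchElementException();
            return page.get(posInPage++);
        }
    }

    /** Walks the whole listing and returns the number of entries seen. */
    static int countEntries(FakeNameNode server, int batchSize) {
        try {
            RemoteIter<String> it = new PagedListing(server, batchSize);
            int count = 0;
            while (it.hasNext()) {
                it.next();
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        FakeNameNode server = new FakeNameNode(10);
        int count = countEntries(server, 3);
        System.out.println("listed " + count + " entries in " + server.requests + " requests");
    }
}
```

With an array-returning listing, the caller would hold all entries at once; here the client-side footprint is bounded by the batch size regardless of how large the directory is.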

Once this change is made, we can modify FsShell, etc. in follow-up JIRAs.

This message was sent by Atlassian JIRA
