Date: Sat, 31 May 2014 16:46:02 +0000 (UTC)
From: "Steve Loughran (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Commented] (HADOOP-10634) Add recursive list apis to FileSystem to give implementations an opportunity for optimization

    [ https://issues.apache.org/jira/browse/HADOOP-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014708#comment-14014708 ]

Steve Loughran commented on HADOOP-10634:
-----------------------------------------

I can see how this benefits object stores, where a new HTTP GET is needed per directory (or per 4K files), so it would make the operation O(1) rather than O(directories).
But even today's per-directory operations cause memory problems, server side and client side, in filesystems with many thousands of files in a directory, especially files with long filenames. In filesystems like HDFS, where the cost of a remote op over an open channel is low, the cost of building multi-MB payloads in a busy namenode is high. And then the client needs to store that information too, pass it around the stack, etc. -it's expensive.

Which is why Hadoop 2 added a new operation to do recursive queries:

{{public RemoteIterator<LocatedFileStatus> listFiles(final Path f, final boolean recursive)}}

It's designed to be lower cost in the FS client and server, at the price of more exposure to concurrent operations. I think you should be able to implement your recursive fetch in that operation: the client does the bulk retrieve (or even does it 4K names at a time) and delivers it to the caller one by one. If you can do that, and then add the MR operations to work on it, you'd benefit all the filesystems, avoid resistance to a memory-intensive operation that is bad for full filesystems, and have something that should work really well with S3.

Note also that we hope to freeze all but maintenance on s3n and switch to s3a (HADOOP-10400), in its own jar (HADOOP-10373). Making changes as that goes in would help with testing that code, as well as justifying users switching to it.

-steve

> Add recursive list apis to FileSystem to give implementations an opportunity for optimization
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-10634
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10634
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 2.4.0
>            Reporter: Sumit Kumar
>         Attachments: HADOOP-10634.patch
>
>
> Currently different code flows in hadoop use recursive listing to discover files/folders in a given path.
> For example, FileInputFormat (both the mapreduce and mapred implementations) does this while calculating splits. However, it lists level by level: to discover files under /foo/bar, it first lists /foo/bar to get the immediate children, then makes the same call on each of those children to discover their immediate children, and so on. This doesn't scale well for fs implementations like s3, because every listStatus call ends up being a webservice call to s3. When a large number of files are considered for input, this makes the getSplits() call slow.
> This patch adds a new set of recursive list apis that give the s3 fs implementation an opportunity to optimize. The behavior remains the same for other implementations (that is, a default implementation is provided, so other filesystems don't have to implement anything new). For s3, however, a simple change (as shown in the patch) improves listing performance.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
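[Editor's note] The contrast discussed above - one listing call per directory versus a single streaming recursive walk in the style of {{FileSystem.listFiles(path, true)}} - can be sketched in self-contained Java. This is not the Hadoop code; it uses java.nio.file as a stand-in for the FS client, and the call counter is purely illustrative:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class ListingSketch {

    // Counts how many "listStatus"-style directory calls are made.
    static int listCalls = 0;

    // Level-by-level recursion: one listing call per directory, the
    // pattern FileInputFormat historically used. Cost is O(directories)
    // remote calls, and the whole result is held in one in-memory list.
    static List<Path> listRecursive(Path dir) throws IOException {
        List<Path> files = new ArrayList<>();
        listCalls++; // stands in for one remote listStatus call
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(dir)) {
            for (Path p : ds) {
                if (Files.isDirectory(p)) {
                    files.addAll(listRecursive(p));
                } else {
                    files.add(p);
                }
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        // Build a small tree: root/f1, root/a/f2, root/a/b/f3
        Path root = Files.createTempDirectory("listing-demo");
        Files.createDirectories(root.resolve("a/b"));
        Files.createFile(root.resolve("f1"));
        Files.createFile(root.resolve("a/f2"));
        Files.createFile(root.resolve("a/b/f3"));

        List<Path> found = listRecursive(root);
        // Three directories visited => three listing calls.
        System.out.println("files=" + found.size() + " calls=" + listCalls);

        // An iterator/stream-style walk delivers entries one by one and
        // never materialises the full tree as a single payload - the
        // shape of RemoteIterator returned by listFiles(path, true).
        long streamed;
        try (Stream<Path> s = Files.walk(root)) {
            streamed = s.filter(Files::isRegularFile).count();
        }
        System.out.println("streamed=" + streamed);
    }
}
```

The point of the sketch is the trade-off named in the comment: the recursive method pays one call per directory and buffers everything, while the streaming walk exposes results incrementally (and, in a real filesystem, is therefore more exposed to concurrent modification while iterating).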