Date: Thu, 7 Mar 2013 07:00:18 +0000 (UTC)
From: "Andrew Wang (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Commented] (HADOOP-9377) FTPFileSystem.listStatus() runs very slow, due to inappropriate call of filePath.makeQualified

    [ https://issues.apache.org/jira/browse/HADOOP-9377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595619#comment-13595619 ]

Andrew Wang commented on HADOOP-9377:
-------------------------------------

I looked into this a bit. The slowness is probably caused by the call to {{fs#getWorkingDirectory}}, which is evaluated as an argument every time {{makeQualified}} is invoked:

{code}
@Deprecated
public Path makeQualified(FileSystem fs) {
  return makeQualified(fs.getUri(), fs.getWorkingDirectory());
}
{code}

Looking at {{FTPFileSystem#getWorkingDirectory}}, this is kind of a bogus call, since it returns the user's home directory, which involves connecting to the FTP server (slow!). So, unless people are depending on their relative paths being qualified with their home dir, the behavior in the posted patch should be fine.

> FTPFileSystem.listStatus() runs very slow, due to inappropriate call of filePath.makeQualified
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-9377
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9377
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.0.3-alpha
>            Reporter: James Yu
>         Attachments: HADOOP-9377.diff
>
>
> FTPFileSystem.listStatus() calls
>   getFileStatus(ftpFiles[i], absolute), which calls
>   new FileStatus(....), which calls
>   filePath.makeQualified(...), which calls
>   fs.getWorkingDirectory(), which calls
>   getHomeDirectory(),
> which creates a new FTP connection every time just to get the working directory. As a result, FTPFileSystem.listStatus() takes a long time to run (on average 3-6 seconds per file in my test).
> I have attached a suggested fix to FTPFileSystem.java; it changes only 4 lines. With the fix applied, the slowness is gone.
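
To make the cost in the call chain above concrete, here is a small self-contained sketch. It does not use Hadoop at all: {{ToyFtpFs}} and every method on it are hypothetical stand-ins that only count how many times a "connection" would be opened. Qualifying against an already-absolute parent is one possible way to avoid the lookup and is an assumption for illustration; the attached patch may do it differently.

```java
// Toy model of the call chain described in HADOOP-9377.
// None of these types are Hadoop classes; all names are hypothetical.
class FtpListingSketch {

    /** Stand-in for an FTP-backed file system that counts server connections. */
    static class ToyFtpFs {
        int connections = 0;

        /** Models getHomeDirectory(): every call opens a new connection. */
        String getHomeDirectory() {
            connections++; // the expensive step in the report
            return "/home/user";
        }

        /** Models getWorkingDirectory() delegating to getHomeDirectory(). */
        String getWorkingDirectory() {
            return getHomeDirectory();
        }

        /**
         * Models makeQualified(fs): the working directory is fetched eagerly,
         * even when the path is already absolute and never needs it.
         */
        String qualifyViaWorkingDir(String path) {
            String workingDir = getWorkingDirectory();
            return path.startsWith("/") ? path : workingDir + "/" + path;
        }

        /** Qualifies a child name against a parent that is already absolute. */
        String qualifyViaParent(String absoluteParent, String name) {
            return absoluteParent + "/" + name;
        }
    }

    /** Lists n files, qualifying each one through the working directory. */
    static int connectionsWithPerFileLookup(int n) {
        ToyFtpFs fs = new ToyFtpFs();
        for (int i = 0; i < n; i++) {
            fs.qualifyViaWorkingDir("/data/file" + i);
        }
        return fs.connections;
    }

    /** Lists n files, qualifying each one against the absolute parent. */
    static int connectionsWithParentQualification(int n) {
        ToyFtpFs fs = new ToyFtpFs();
        for (int i = 0; i < n; i++) {
            fs.qualifyViaParent("/data", "file" + i);
        }
        return fs.connections;
    }

    public static void main(String[] args) {
        System.out.println(connectionsWithPerFileLookup(100));       // prints 100
        System.out.println(connectionsWithParentQualification(100)); // prints 0
    }
}
```

In this model the per-file lookup costs one connection per listed file even though every path is absolute, which matches the per-file delay the reporter measured. The trade-off is the one noted in the comment: relative paths would no longer be qualified against the home directory.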