hadoop-common-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-9377) FTPFileSystem.listStatus() runs very slow, due to inappropriate call of filePath.makeQualified
Date Thu, 07 Mar 2013 07:00:18 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-9377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595619#comment-13595619 ]

Andrew Wang commented on HADOOP-9377:

I looked into this a bit. The slowness is probably caused by the call to {{fs#getWorkingDirectory}}
inside makeQualified:

  public Path makeQualified(FileSystem fs) {
    return makeQualified(fs.getUri(), fs.getWorkingDirectory());
  }

Looking in {{FTPFileSystem#getWorkingDirectory}}, this is kind of a bogus call, since it returns
the user's home directory, which involves connecting to the FTP server (slow!). So, unless
people are depending on their relative paths being qualified with their home dir, the behavior
in the posted patch should be fine.
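
To illustrate why the working directory is only relevant for relative paths, here is a minimal sketch. It is a stand-in written for this comment, not Hadoop's {{Path}} class: {{QualifySketch}}, {{qualify}}, and the example host and directories are all hypothetical. An already-absolute path comes out the same whatever working directory is passed in, so fetching the real one from the server is wasted work in that case.

```java
import java.net.URI;

// Hypothetical stand-in for path qualification; this is NOT Hadoop's Path class.
class QualifySketch {

    // Qualifies a path string against a filesystem URI and a working directory.
    static URI qualify(URI fsUri, String workingDir, String path) {
        String base = workingDir.endsWith("/") ? workingDir : workingDir + "/";
        // Only a relative path ever consults the working directory.
        String absolute = path.startsWith("/") ? path : base + path;
        return URI.create(fsUri.getScheme() + "://" + fsUri.getAuthority() + absolute);
    }

    public static void main(String[] args) {
        URI fs = URI.create("ftp://example.com"); // assumed example host

        // Absolute path: the working directory has no effect on the result.
        System.out.println(qualify(fs, "/home/alice", "/pub/data.txt"));
        System.out.println(qualify(fs, "/", "/pub/data.txt"));

        // Relative path: the result depends on the working directory.
        System.out.println(qualify(fs, "/home/alice", "data.txt"));
    }
}
```

The third call is the case that would change if relative paths stopped being qualified with the home directory.
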
> FTPFileSystem.listStatus() runs very slow, due to inappropriate call of filePath.makeQualified
> ----------------------------------------------------------------------------------------------
>                 Key: HADOOP-9377
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9377
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.0.3-alpha
>            Reporter: James Yu
>         Attachments: HADOOP-9377.diff
> FTPFileSystem.listStatus() calls
> getFileStatus(ftpFiles[i], absolute) calls
> new FileStatus(....) calls 
> filePath.makeQualified(...) calls
> fs.getWorkingDirectory() calls
> getHomeDirectory()
> which creates a new FTP connection every time to get the working directory. This causes
> FTPFileSystem.listStatus() to take a long time to run (on average 3-6 seconds per file in
> my test).
> I attach a suggested fix in FTPFileSystem.java, only 4 lines of change. After the
> fix, there's no slowness issue anymore.
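
The cost pattern in the call chain above can be modelled with a small sketch that counts simulated connections. Everything here is hypothetical ({{ListingCostSketch}}, {{fetchHomeDirectory}}, both listing methods), and hoisting the lookup out of the loop is just one possible way to avoid the repeated cost; it is not necessarily what the attached patch does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the reported call chain; none of these names are Hadoop APIs.
class ListingCostSketch {
    int connectionsOpened = 0;

    // Stands in for a home-directory lookup that needs a new server connection.
    String fetchHomeDirectory() {
        connectionsOpened++;
        return "/home/user";
    }

    // Resolves a name against the listed directory; the working directory is
    // only needed when the listed directory itself is relative.
    static String resolve(String workingDir, String dir, String name) {
        String parent = dir.startsWith("/") ? dir : workingDir + "/" + dir;
        return parent + "/" + name;
    }

    // Mirrors the reported behaviour: one lookup (one connection) per entry.
    List<String> listWithPerEntryLookup(String dir, List<String> names) {
        List<String> out = new ArrayList<>();
        for (String name : names) {
            out.add(resolve(fetchHomeDirectory(), dir, name));
        }
        return out;
    }

    // Alternative: look the directory up once and reuse it for every entry.
    List<String> listWithSingleLookup(String dir, List<String> names) {
        String workingDir = fetchHomeDirectory();
        List<String> out = new ArrayList<>();
        for (String name : names) {
            out.add(resolve(workingDir, dir, name));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a.txt", "b.txt", "c.txt");

        ListingCostSketch perEntry = new ListingCostSketch();
        perEntry.listWithPerEntryLookup("/pub", names);
        System.out.println("per-entry lookups: " + perEntry.connectionsOpened);

        ListingCostSketch single = new ListingCostSketch();
        single.listWithSingleLookup("/pub", names);
        System.out.println("single lookup: " + single.connectionsOpened);
    }
}
```

Both methods return the same paths; only the number of simulated connections differs, and in the per-entry version it grows with the number of files listed.
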

