hadoop-common-issues mailing list archives

From "Elek, Marton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15358) SFTPConnectionPool connections leakage
Date Fri, 23 Nov 2018 08:48:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16696528#comment-16696528 ]

Elek, Marton commented on HADOOP-15358:
---------------------------------------

+1. It looks good to me.

A very precise problem definition and a clean implementation with a unit test. It is backward
compatible, as the old methods are still there (and they still open their own connections).
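
A rough sketch of that compatibility pattern, not the patch itself: it assumes the change adds overloads that accept an already-acquired channel, while the old no-argument methods remain and open a connection of their own. FakeChannel, FakeConnector and the method bodies are hypothetical stand-ins, and the model simply counts a connection for every no-argument call rather than reproducing SFTPConnectionPool's bookkeeping.

```java
public class OverloadReuseSketch {

    /** Hypothetical stand-in for a ChannelSftp; it carries no state. */
    static final class FakeChannel { }

    /** Hypothetical connector that counts how many connections were opened. */
    static final class FakeConnector {
        private int opened = 0;

        FakeChannel connect() {
            opened++;                      // simulates a new connection to the server
            return new FakeChannel();
        }

        int opened() { return opened; }
    }

    private final FakeConnector connector;

    OverloadReuseSketch(FakeConnector connector) {
        this.connector = connector;
    }

    /** Old entry point, kept for compatibility: opens its own connection. */
    String getHomeDirectory() {
        return getHomeDirectory(connector.connect());
    }

    /** Assumed new overload: works on the channel the caller already holds. */
    String getHomeDirectory(FakeChannel channel) {
        return "/home/user";               // placeholder for a call such as channel.pwd()
    }

    /** Recursive caller that uses the old method at every level. */
    void mkdirsOldStyle(FakeChannel channel, int depth) {
        if (depth == 0) return;
        getHomeDirectory();                // one more connection per level
        mkdirsOldStyle(channel, depth - 1);
    }

    /** Recursive caller that passes its own channel down. */
    void mkdirsReusing(FakeChannel channel, int depth) {
        if (depth == 0) return;
        getHomeDirectory(channel);         // no additional connection
        mkdirsReusing(channel, depth - 1);
    }

    public static void main(String[] args) {
        FakeConnector oldStyle = new FakeConnector();
        new OverloadReuseSketch(oldStyle).mkdirsOldStyle(oldStyle.connect(), 4);

        FakeConnector reusing = new FakeConnector();
        new OverloadReuseSketch(reusing).mkdirsReusing(reusing.connect(), 4);

        // prints "oldStyle=5 reusing=1"
        System.out.println("oldStyle=" + oldStyle.opened() + " reusing=" + reusing.opened());
    }
}
```

Existing callers of the no-argument method keep compiling and behaving as before; only internal callers that already hold a channel need to switch to the overload.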

The unit test passes, and I tested the patch with 'dfs ls', which worked well. I also observed
the recursive behaviour in a Java debugger.

Will commit it to trunk soon...

> SFTPConnectionPool connections leakage
> --------------------------------------
>
>                 Key: HADOOP-15358
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15358
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 3.0.0
>            Reporter: Mikhail Pryakhin
>            Assignee: Mikhail Pryakhin
>            Priority: Critical
>         Attachments: HADOOP-15358.001.patch
>
>
> Methods of SFTPFileSystem operate on poolable ChannelSftp instances. Some of these methods
> are chained together, so a single compound action ends up establishing multiple connections
> to the SFTP server. The affected methods are listed below:
>  # mkdirs method
> The public mkdirs method acquires a new ChannelSftp from the pool [1]
> and then recursively creates directories, first checking whether each directory exists
> by calling the exists method [2], which delegates to the getFileStatus(ChannelSftp channel,
> Path file) method [3], and so on until a FileStatus instance is returned [4].
> The resource leak occurs in the getWorkingDirectory method, which calls the getHomeDirectory
> method [5], which in turn establishes a new connection to the SFTP server instead of using
> the already created connection. As the mkdirs method is recursive, this results in creating
> a huge number of connections.
>  # open method [6]. This method returns an FSDataInputStream that wraps an
> SFTPInputStream instance, which does not return the acquired ChannelSftp instance to the
> pool but closes it instead [7]. This leads to establishing another connection to the SFTP
> server when the next method is called on the FileSystem instance.
> [1] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L658
> [2] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L321
> [3] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L202
> [4] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L290
> [5] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L640
> [6] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L504
> [7] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPInputStream.java#L123
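
The second item in the description (a stream that closes its channel instead of handing it back) can be illustrated with a minimal, self-contained model. This is only a sketch under stated assumptions: FakeChannel, FakePool and PooledStream are hypothetical names, not classes from Hadoop or JSch, and the pool here is far simpler than SFTPConnectionPool.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.Deque;

public class PooledStreamSketch {

    /** Hypothetical stand-in for a ChannelSftp. */
    static final class FakeChannel { }

    /** Hypothetical minimal pool that counts the connections it had to open. */
    static final class FakePool {
        private final Deque<FakeChannel> idle = new ArrayDeque<>();
        private int opened = 0;

        FakeChannel acquire() {
            FakeChannel c = idle.poll();
            if (c == null) {
                opened++;                  // simulates a new connection to the server
                c = new FakeChannel();
            }
            return c;
        }

        void release(FakeChannel c) { idle.push(c); }

        int opened() { return opened; }
    }

    /** Stream that hands its channel back to the pool when it is closed. */
    static final class PooledStream extends FilterInputStream {
        private final FakePool pool;
        private final FakeChannel channel;
        private boolean closed = false;

        PooledStream(InputStream in, FakePool pool, FakeChannel channel) {
            super(in);
            this.pool = pool;
            this.channel = channel;
        }

        @Override
        public void close() throws IOException {
            if (closed) return;            // avoid releasing the same channel twice
            closed = true;
            try {
                super.close();
            } finally {
                pool.release(channel);     // return the channel instead of dropping it
            }
        }
    }

    /** Opens and closes `reads` streams; returns how many connections were opened. */
    static int connectionsOpened(int reads, boolean returnToPool) {
        FakePool pool = new FakePool();
        for (int i = 0; i < reads; i++) {
            FakeChannel channel = pool.acquire();
            InputStream data = new ByteArrayInputStream(new byte[] {1, 2, 3});
            // Without the wrapper, the channel is never released back to the pool.
            InputStream in = returnToPool ? new PooledStream(data, pool, channel) : data;
            try {
                in.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return pool.opened();
    }

    public static void main(String[] args) {
        // prints "closing=3 returning=1"
        System.out.println("closing=" + connectionsOpened(3, false)
            + " returning=" + connectionsOpened(3, true));
    }
}
```

In this model the number of connections grows with the number of reads when the stream drops its channel, and stays at one when close() releases the channel to the pool.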



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

