commons-user mailing list archives

From "Daniel F. Savarese" <...@savarese.org>
Subject Re: [vfs] FTP extremely slow compared to SFTP
Date Thu, 13 Dec 2007 16:40:04 GMT

In message <F0D7281DAB048B438E8F5EC4ECEFBDDC027A2808@esmail.elsag.de>,
Jörg Schaible writes:
>Well, it is a bit more complex. Some files in the target directory are
>opened and read again, but this does not explain the numbers below. The
>only difference is the initialization of the FileSystem with a SFTP URL
>instead of one with FTP. Since SFTP adds overhead on top of FTP the only
>conclusion is, that the FTP implementation in use is horrible slow.

I haven't been following this thread, so I may have missed some
important piece of information.  However, this statement does not
make sense.  For any network file transfer, the network I/O time
dominates.  SFTP compresses file data (although I don't know about the
version used by VFS) and FTP does not.  Therefore, for average files
(i.e., not already highly compressed) SFTP should produce much better
throughput than FTP (anywhere from a factor of 2 to 10 better depending
on the nature of the files).

Commons Net has been in use in one form or another for about 10
years now.  If FTP transfers were "horrible slow," it would have
been detected long ago.  Copying bits in user-space doesn't cost all
that much more in Java than in C and my measurements show that
FTPClient still delivers about the same throughput as command-line FTP
for large files today as it did 10 years ago.  A problem that eventually
arose with using storeFile instead of storeFileStream was that the
copy buffer used became too small for newer TCP/IP stacks, starving
the kernel I/O path for work and resulting in excessive user-space
copies, so a method was added to adjust the buffer size.  As I
mentioned in another thread, the ideal thing would be to use the Java
equivalent of sendfile, but that wouldn't help you.
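
To illustrate the copy-buffer point, here is a minimal sketch of a
user-space copy loop with an adjustable buffer.  It is not the Commons
Net implementation; the class and method names (BufferedCopy, copy,
countWrites) are made up for this example, and it only uses in-memory
streams so that it runs without a server.  It shows that a small
buffer means many more read/write round trips for the same payload.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Commons Net.
class BufferedCopy {

    /** Copies in to out through a buffer of bufferSize bytes; returns the number of write calls. */
    static int copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        int writes = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);   // one user-space copy per pass
            writes++;
        }
        out.flush();
        return writes;
    }

    /** Copies totalBytes of in-memory data and reports how many writes were needed. */
    static int countWrites(int totalBytes, int bufferSize) {
        try {
            return copy(new ByteArrayInputStream(new byte[totalBytes]),
                        new ByteArrayOutputStream(totalBytes),
                        bufferSize);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int total = 1 << 20; // 1 MiB of dummy data
        for (int size : new int[] {512, 1024, 8192, 65536}) {
            System.out.println(size + " byte buffer: " + countWrites(total, size) + " writes");
        }
    }
}
```

With a real socket the number of bytes returned by each read varies,
so the counts above are only a best case for a given buffer size.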

After examining the numbers you provided, I actually don't understand
why you're looking for a problem in FTPClient if your data shows
transfers are actually faster with FTP than with SFTP for large files.
You reported
FTP:
>> * Single File (25000kb) 
>>  - Exporter.............................................. 11437.0 ms
>> * Single File (50000kb) 
>>  - Exporter.............................................. 16043.0 ms
SFTP:
>> * Single File (25000kb)
>>  - Exporter.............................................. 14952.0 ms
>> * Single File (50000kb) 
>>  - Exporter.............................................. 24885.0 ms

Clearly FTP is not "horrible slow" in comparison to SFTP.
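
As a quick check of the large-file figures quoted above, the sketch
below converts them to throughput.  It assumes "kb" in the report
means kilobytes and that the Exporter time covers only the transfer;
the class and method names are made up for this example.

```java
// Hypothetical helper for turning the reported timings into throughput.
class TransferThroughput {

    /** Throughput in kB/s for sizeKb kilobytes moved in elapsedMs milliseconds. */
    static double kbPerSecond(double sizeKb, double elapsedMs) {
        return sizeKb / (elapsedMs / 1000.0);
    }

    public static void main(String[] args) {
        // Each row: size in kb, FTP time in ms, SFTP time in ms (from the quoted report).
        double[][] runs = {
            {25000, 11437.0, 14952.0},
            {50000, 16043.0, 24885.0},
        };
        for (double[] r : runs) {
            System.out.println((long) r[0] + " kb: FTP "
                    + Math.round(kbPerSecond(r[0], r[1])) + " kB/s, SFTP "
                    + Math.round(kbPerSecond(r[0], r[2])) + " kB/s");
        }
    }
}
```

Under those assumptions FTP comes out at roughly 2186 and 3117 kB/s
against roughly 1672 and 2009 kB/s for SFTP.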

In order to get results such as:
FTP:
>> * Single Directory (with 50 files of 1kb)
>>  - Exporter.............................................. 224312.0 ms
SFTP:
>> * Single Directory (with 50 files of 1kb)
>>  - Exporter.............................................. 1552.0 ms

either FTPClient is not being used efficiently by the calling code
or the overhead of the establishment of each FTP DATA connection is adding up
(in which case command-line FTP would produce the same results and what
you're seeing is SFTP benefitting from establishing a single TCP connection
and FTP requiring a new connection for each transfer).  I'd suggest
investigating those possibilities in more depth.
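
To put the small-file numbers in perspective, the sketch below splits
each run into an average cost per file and compares it with the time
the 1kb payload itself would need at the large-file FTP rate.  It
assumes "kb" means kilobytes and that the 50 files were transferred
one after another; the class and method names are made up for this
example.

```java
// Hypothetical helper for estimating fixed per-file overhead from the quoted timings.
class PerFileOverhead {

    /** Average elapsed time per file in milliseconds. */
    static double msPerFile(double totalMs, int fileCount) {
        return totalMs / fileCount;
    }

    /** Time in milliseconds to move fileKb kilobytes at a rate of kbPerSecond. */
    static double payloadMs(double fileKb, double kbPerSecond) {
        return fileKb / kbPerSecond * 1000.0;
    }

    public static void main(String[] args) {
        double ftpPerFile = msPerFile(224312.0, 50);   // FTP, 50 files of 1kb
        double sftpPerFile = msPerFile(1552.0, 50);    // SFTP, 50 files of 1kb
        // Rate taken from the reported 25000kb FTP transfer (11437.0 ms).
        double ftpRate = 25000.0 / (11437.0 / 1000.0);
        double payload = payloadMs(1.0, ftpRate);

        System.out.println("FTP ms per file:  " + ftpPerFile);
        System.out.println("SFTP ms per file: " + sftpPerFile);
        System.out.println("Payload ms for 1kb at the large-file FTP rate: " + payload);
    }
}
```

Under those assumptions FTP spends about 4.5 seconds per file while
the data itself accounts for well under a millisecond, so nearly all
of the time is fixed per-file cost rather than transfer time.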



