hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3246) FTP client over HDFS
Date Thu, 08 May 2008 17:30:04 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595335#action_12595335 ]

Doug Cutting commented on HADOOP-3246:
--------------------------------------

Overall this looks very good!  A few minor suggestions:
 - We should accept the username and password in the uri, as ftp://user:pass@host/, with the password optional.
 - It would be nice if folks could specify different usernames and passwords for different hosts in their configuration, perhaps with properties like ftp.user.host.example.com and ftp.password.host.example.com.
 - Rather than keeping a connection open in the FileSystem instance, perhaps we should open and close new connections for each file read, written, renamed, etc?  FileSystem.java caches FileSystem implementations forever, and an FTP connection might time out.  Also, the working-directory state of the connection makes this not thread-safe, which a connection per request would fix.
 - It would be best if the unit tests ran standalone, without requiring an external FTP server.  We might include the Mina FTP server just for testing?  We could put the jars somewhere in src/test.
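
As a rough illustration of the first two suggestions, here is a minimal sketch of how credentials might be resolved. It is not taken from the attached patch. The class name FtpCredentialSketch, the resolve method, and the use of a plain Map in place of Hadoop's configuration object are all assumptions made for illustration, as is the precedence rule (URI first, then per-host properties). Only the ftp://user:pass@host/ form and the ftp.user.<host> / ftp.password.<host> property names come from the comment above.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the patch: resolves FTP credentials
// from the URI first, then from per-host configuration properties.
class FtpCredentialSketch {

    /** Returns {user, password}; either element may be null if nothing was found. */
    public static String[] resolve(URI uri, Map<String, String> conf) {
        String user = null;
        String password = null;

        // 1. Credentials embedded in the URI (ftp://user:pass@host/).
        //    The password part is optional.
        String userInfo = uri.getUserInfo();
        if (userInfo != null) {
            int colon = userInfo.indexOf(':');
            if (colon >= 0) {
                user = userInfo.substring(0, colon);
                password = userInfo.substring(colon + 1);
            } else {
                user = userInfo;
            }
        }

        // 2. Assumed fallback: per-host properties such as
        //    ftp.user.host.example.com and ftp.password.host.example.com.
        String host = uri.getHost();
        if (user == null) {
            user = conf.get("ftp.user." + host);
        }
        if (password == null) {
            password = conf.get("ftp.password." + host);
        }
        return new String[] {user, password};
    }

    public static void main(String[] args) {
        // A plain Map stands in for the real configuration here.
        Map<String, String> conf = new HashMap<>();
        conf.put("ftp.user.host.example.com", "ankur");
        conf.put("ftp.password.host.example.com", "fromConfig");

        String[] fromUri = resolve(URI.create("ftp://alice:secret@host.example.com/data"), conf);
        String[] fromConf = resolve(URI.create("ftp://host.example.com/data"), conf);

        System.out.println(fromUri[0] + " / " + fromUri[1]);
        System.out.println(fromConf[0] + " / " + fromConf[1]);
    }
}
```

With this ordering, a URI that carries only a username would still pick up the password from the per-host property, which may or may not be the behaviour we want.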

> FTP client over HDFS
> --------------------
>
>                 Key: HADOOP-3246
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3246
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: util
>    Affects Versions: 0.16.3
>            Reporter: Ankur
>            Priority: Minor
>         Attachments: ftpFileSystem.patch
>
>
> An FTP client that stores content directly into HDFS allows data from FTP servers to be stored directly into HDFS instead of first copying the data locally and then uploading it into HDFS. The benefits are apparent from an administrative perspective as large datasets can be pulled from FTP servers with minimal human intervention.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

