hadoop-hdfs-issues mailing list archives

From "Alejandro Abdelnur (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP
Date Thu, 03 Nov 2011 20:27:32 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143509#comment-13143509 ]

Alejandro Abdelnur commented on HDFS-2316:

Thanks for the updated PDF with the API, looks good.

Following are the remaining issues:

1. Regarding FileStatus containing symlink & isSymlink elements: got it, they should be there.
That said, it would be enough to have symlink as an optional element, thus reducing the size of the response.

2. Regarding using a 'username' parameter instead of 'user.name': this comes from hadoop-auth
(Alfredo), so it should be changed there, not here.

3. Regarding whether querystring parameters/values are case-sensitive or not: IMO, as the path is
case-sensitive, the querystring should be as well, so as not to confuse developers/users.

4. Regarding filestatus containing localname instead of the full path to make the payload smaller:
it makes sense. But shouldn't it just be called 'name'?

5. Regarding the filestatus, delete, rename, mkdirs, setreplication payloads and the root element
being a classname: JSON does not require a root element; a JSON response can be a list of
key/value pairs (a JSON object). I'd prefer to keep it like that, especially for filestatus when
doing a liststatus operation, else the payload will increase significantly in size. Another
issue with the name of the class is that it should be a public class, not an implementation
one (it currently uses 'HdfsFileStatus').

You mention that the root element class is added because XML requires a root element.
We are not specifying XML here, so I don't see this as a requirement. And if somebody is
converting JSON to XML, they should account for that in the transcoding.

6. Regarding the scheme to use, "webhdfs://" or "http://": we are doing HTTP, and this is why,
IMO, we should use "http://". For example, when using curl you'll use "http://", not "webhdfs://";
it will be less confusing to developers.
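
To make the payload-size concerns in points 1 and 5 concrete, here is a minimal Python sketch. It is only an illustration: the field names ("name", "type", "length", "replication", "symlink") and the helper make_status are hypothetical and are not taken from the attached API PDF. It compares a liststatus-style response whose entries are plain JSON objects against one where every entry is wrapped in a root element named after a class, and it emits symlink only when it is set.

```python
import json


def make_status(name, symlink=None):
    """Build one file-status entry. Field names are illustrative assumptions."""
    status = {
        "name": name,          # local name rather than the full path (point 4)
        "type": "FILE",
        "length": 1024,
        "replication": 3,
    }
    # Point 1: symlink is optional, so it costs nothing for ordinary files.
    if symlink is not None:
        status["symlink"] = symlink
    return status


# A directory listing with 1000 entries.
names = [f"part-{i:05d}" for i in range(1000)]

# Variant A: each entry is a plain JSON object, no root element.
plain = [make_status(n) for n in names]

# Variant B: each entry is wrapped in a root element named after a class.
wrapped = [{"HdfsFileStatus": make_status(n)} for n in names]

# Compact separators so that only structural bytes are counted.
plain_bytes = len(json.dumps(plain, separators=(",", ":")))
wrapped_bytes = len(json.dumps(wrapped, separators=(",", ":")))
overhead_per_entry = (wrapped_bytes - plain_bytes) // len(names)

print("plain listing:   ", plain_bytes, "bytes")
print("wrapped listing: ", wrapped_bytes, "bytes")
print("overhead / entry:", overhead_per_entry, "bytes")
```

With compact serialization, the wrapper adds the bytes of `{"HdfsFileStatus":` plus a closing `}` to every entry, so the extra cost grows linearly with the number of files in the listing.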

> webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP
> --------------------------------------------------------------------------
>                 Key: HDFS-2316
>                 URL: https://issues.apache.org/jira/browse/HDFS-2316
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>         Attachments: WebHdfsAPI20111020.pdf, WebHdfsAPI20111103.pdf
> We currently have hftp for accessing HDFS over HTTP.  However, hftp is a read-only FileSystem
and does not provide "write" accesses.
> In HDFS-2284, we propose to have webhdfs for providing a complete FileSystem implementation
for accessing HDFS over HTTP.  This is the umbrella JIRA for the tasks.
