hadoop-hdfs-issues mailing list archives

From "Alejandro Abdelnur (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP
Date Sat, 05 Nov 2011 17:54:52 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144764#comment-13144764 ]

Alejandro Abdelnur commented on HDFS-2316:


Now I understand how you are proposing to structure the JSON payload for filestatuses. You are
correct, the overhead is minimal.


Regarding #1, 'type' sounds good.

Regarding #2, ok.

Regarding #3, having case-sensitive params does not mean they have to be all lowercase.
Hoop originally used case-sensitive parameters in camelCase, so the 'doas' parameter
was 'doAs'. How about going back to that for all parameter names and values? For the 'op'
values, it means they would mimic the FileSystem method names (that was also the initial
motivation in Hoop).
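
To illustrate the camelCase proposal, here is a minimal sketch of a request URL whose parameter names and 'op' value are case sensitive. Only 'op', 'doAs' and the idea of mirroring FileSystem method names come from this thread; the host, port, path, user name, and the helper names build_url and get_param are assumptions made up for the example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_url(base, path, op, **params):
    """Build a request URL; 'op' mirrors a FileSystem method name (e.g. listStatus)."""
    query = urlencode({"op": op, **params})
    return f"{base}{path}?{query}"


def get_param(url, name):
    """Case-sensitive lookup: 'doAs' and 'doas' are treated as different names."""
    values = parse_qs(urlsplit(url).query).get(name)
    return values[0] if values else None


# Hypothetical host, port, path and user.
url = build_url("http://example.org:14000", "/user/foo", "listStatus", doAs="alice")
print(url)
print(get_param(url, "doAs"))  # the camelCase name matches
print(get_param(url, "doas"))  # the lowercase name does not
```

With case-sensitive matching, a client that sends 'doas' instead of 'doAs' would simply not be recognized, so the spelling has to be fixed once in the API definition.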

Regarding #4, with both 'name' and 'localname' it is not clear when you will get one or the other.
With the full path, the FileStatus is self-contained: you don't need to know
the requested URL to know the file location in the filesystem, but the payloads of filestatuses
are bigger. 'localname' is the other way around: you need to know the requested URL
to know the file location in the filesystem, but the payloads of filestatuses are smaller.
IMO we should choose one. I prefer the full path because it makes the filestatus self-contained.
Regarding the size of the payload, I wouldn't worry much about it, as we are always talking
about the contents of a single directory, we are using a verbose syntax anyway, and
you could use compression in the server responses.
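
To make the trade-off in #4 concrete, the sketch below builds the same directory listing both ways and compares raw and gzip-compressed sizes. The 'name', 'localname' and 'type' keys are the ones discussed here; the directory, the file names, the "FILE" value and the helpers resolve and sizes are assumptions for illustration, not the actual payload format.

```python
import gzip
import json
import posixpath

requested_dir = "/user/foo/data"                   # hypothetical listed directory
children = [f"part-{i:05d}" for i in range(100)]   # hypothetical file names

# Option A: full path, each filestatus is self-contained.
full = [{"name": posixpath.join(requested_dir, c), "type": "FILE"} for c in children]
# Option B: local name only, the client must remember the requested URL.
local = [{"localname": c, "type": "FILE"} for c in children]


def resolve(requested_dir, status):
    """Recover the file location for either payload style."""
    if "name" in status:
        return status["name"]
    return posixpath.join(requested_dir, status["localname"])


def sizes(payload):
    """Return (raw bytes, gzip-compressed bytes) of the JSON encoding."""
    raw = json.dumps(payload).encode("utf-8")
    return len(raw), len(gzip.compress(raw))


print("full path :", sizes(full))
print("local name:", sizes(local))
```

The repeated directory prefix is exactly the kind of redundancy that compression removes, which is why the raw size difference matters less once responses are compressed.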

Regarding #5, my issue here is with adding an extra nested level just for a possible conversion
to XML. Is this a user requirement? If not, I'd prefer to keep it without the class name.

Hoop can proxy any filesystem implementation. Because of this, the HTTP REST API should be
restricted to the FileSystem public API, without exposing implementation specifics.
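
For reference, the two shapes under discussion in #5 might look like the sketch below. The wrapper key "FileStatus", the field values and the unwrap helper are assumptions for illustration; the thread only says the extra level would carry the class name.

```python
import json

# Hypothetical filestatus fields.
status = {"name": "/user/foo/a.txt", "type": "FILE"}

flat = status                       # no extra level
wrapped = {"FileStatus": status}    # extra level keyed by the class name


def unwrap(payload, class_name="FileStatus"):
    """Accept either shape and return the inner filestatus object."""
    if set(payload) == {class_name}:
        return payload[class_name]
    return payload


print(json.dumps(flat))
print(json.dumps(wrapped))
```

A client written against the wrapped form has to peel off one level on every response, which is the cost being questioned when XML conversion is not a stated requirement.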

Regarding #6, I disagree. The whole point of this discussion about having a single HTTP REST API
for Hoop and WebHDFS is to achieve interoperability between implementations and make it
transparent to users.

> webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP
> --------------------------------------------------------------------------
>                 Key: HDFS-2316
>                 URL: https://issues.apache.org/jira/browse/HDFS-2316
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>         Attachments: WebHdfsAPI20111020.pdf, WebHdfsAPI20111103.pdf
> We currently have hftp for accessing HDFS over HTTP.  However, hftp is a read-only FileSystem
> and does not provide "write" access.
> In HDFS-2284, we propose to have webhdfs for providing a complete FileSystem implementation
> for accessing HDFS over HTTP.  This is the umbrella JIRA for the tasks.
