hadoop-hdfs-issues mailing list archives

From "Sanjay Radia (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2178) Contributing Hoop to HDFS, replacement for HDFS proxy with read/write capabilities
Date Fri, 21 Oct 2011 20:10:33 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133015#comment-13133015 ]

Sanjay Radia commented on HDFS-2178:
------------------------------------

Good example that forced me to really think about compatibility between webhdfs and proxy.
* I assume the goal is this: a customer writes an application that uses our REST APIs to
read and write files, AND that application works transparently against BOTH the proxy
and webhdfs. Do you agree with this?
* 100-continue, as defined in the HTTP 1.1 spec, is a very good solution because the API then
works optimally with both webhdfs and the proxy: it allows the proxy to NOT redirect and the NN
to redirect. Unfortunately, it is not well supported by Jetty 6 (it is supported by Jetty 7,
curl, and httpclient).
* Arpit's Put solution:
** Works really well with webhdfs, both under the current Jetty 6 limitation and in the future
when Jetty supports 100-continue correctly.
** However, with the proxy, the proxy ends up creating the file twice. (The proxy should not
redirect, hence it is not an infinite loop.)
* The getHandle solution has the unfortunate effect on the proxy of always forcing 2 operations
when 1 operation would have been enough. It is also a solution that we will deprecate once
Jetty supports 100-continue.
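
To make the 100-continue point concrete, here is a minimal sketch of the handshake. It is not webhdfs or Hoop code: `serve_once`, `put_with_expect`, `run`, and the `datanode.example` URL are hypothetical names for illustration, and the server is a toy built on raw sockets. Passing `redirect_to=None` models the proxy (accept the data directly); passing a URL models the NN (redirect before any data is sent).

```python
import socket
import threading


def read_head(sock):
    """Read bytes up to the blank line that ends an HTTP header block."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = sock.recv(1024)
        if not chunk:
            break
        data += chunk
    head, _, rest = data.partition(b"\r\n\r\n")
    return head.decode("latin-1"), rest


def header_value(head, wanted):
    """Return the value of one header from a decoded header block, or None."""
    for line in head.split("\r\n")[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == wanted:
            return value.strip()
    return None


def serve_once(listener, redirect_to, stored):
    """Handle one PUT. redirect_to=None models the proxy, a URL models the NN."""
    conn, _ = listener.accept()
    with conn:
        head, rest = read_head(conn)
        if redirect_to is not None:
            # NN-like behaviour: answer with a redirect instead of 100 Continue,
            # so the client never uploads the body to this server.
            conn.sendall(("HTTP/1.1 307 Temporary Redirect\r\n"
                          "Location: %s\r\n"
                          "Content-Length: 0\r\n\r\n" % redirect_to).encode())
            return
        # Proxy-like behaviour: ask for the body, then store it.
        conn.sendall(b"HTTP/1.1 100 Continue\r\n\r\n")
        length = int(header_value(head, "content-length") or 0)
        body = rest
        while len(body) < length:
            chunk = conn.recv(1024)
            if not chunk:
                break
            body += chunk
        stored.append(body)
        conn.sendall(b"HTTP/1.1 201 Created\r\nContent-Length: 0\r\n\r\n")


def put_with_expect(port, path, payload):
    """Send headers with Expect: 100-continue; send the body only if told to."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(("PUT %s HTTP/1.1\r\n"
                      "Host: 127.0.0.1\r\n"
                      "Content-Length: %d\r\n"
                      "Expect: 100-continue\r\n\r\n" % (path, len(payload))).encode())
        head, _ = read_head(sock)
        status = int(head.split()[1])
        if status != 100:
            # Final response arrived early (e.g. a redirect): body was not sent.
            return status, header_value(head, "location"), False
        sock.sendall(payload)
        head, _ = read_head(sock)
        return int(head.split()[1]), None, True


def run(redirect_to):
    """Start a one-shot toy server and issue a single PUT against it."""
    stored = []
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener, redirect_to, stored))
    worker.start()
    result = put_with_expect(port, "/f.txt", b"hello")
    worker.join()
    listener.close()
    return result, stored
```

In this sketch the same client code works against both server behaviours: `run(None)` uploads the data in one operation, while `run("http://datanode.example/f.txt")` returns the redirect location without the client having sent any file data, so nothing is created on the redirecting server.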
                
> Contributing Hoop to HDFS, replacement for HDFS proxy with read/write capabilities
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-2178
>                 URL: https://issues.apache.org/jira/browse/HDFS-2178
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 0.23.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 0.23.0
>
>         Attachments: HDFS-2178.patch, HDFSoverHTTP-API.html, HdfsHttpAPI.pdf
>
>
> We'd like to contribute Hoop to Hadoop HDFS as a replacement (an improvement) for HDFS
> Proxy.
> Hoop provides access to all Hadoop Distributed File System (HDFS) operations (read and
> write) over HTTP/S.
> The Hoop server component is a REST HTTP gateway to HDFS supporting all file system operations.
> It can be accessed using standard HTTP tools (e.g. curl and wget), HTTP libraries from different
> programming languages (e.g. Perl, JavaScript) as well as using the Hoop client. The Hoop server
> component is a standard Java web-application and it has been implemented using Jersey (JAX-RS).
> The Hoop client component is an implementation of Hadoop FileSystem client that allows
> using the familiar Hadoop filesystem API to access HDFS data through a Hoop server.
>   Repo: https://github.com/cloudera/hoop
>   Docs: http://cloudera.github.com/hoop
>   Blog: http://www.cloudera.com/blog/2011/07/hoop-hadoop-hdfs-over-http/
> Hoop is a Maven-based project that depends on Hadoop HDFS and Alfredo (for Kerberos HTTP
> SPNEGO authentication).
> To make the integration easy, HDFS Mavenization (HDFS-2096) would have to be done first,
> as well as the Alfredo contribution (HADOOP-7119).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
