hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1563) Create FileSystem implementation to read HDFS data via http
Date Fri, 06 Jul 2007 22:39:04 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510818 ]

Doug Cutting commented on HADOOP-1563:
--------------------------------------

I think we should implement a servlet that:
1. Treats everything after HttpServletRequest#getContextPath() as a path.
2. If the path names an HDFS file, sets the file's attributes as HTTP headers and then: for HEAD,
returns an empty body; for GET, returns the file's content; for any other method, returns an error.
3. If the request is a HEAD or GET of a non-slash-terminated directory, redirects to the
slash-terminated directory.
4. If the request is a HEAD or GET of a slash-terminated directory, sets attributes as headers and,
for GET, returns HTML containing links to that directory's files.
5. Otherwise, returns an error.
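
The five rules above can be sketched as a pure decision function, kept free of the servlet API so it can be run on its own. This is only an illustration under stated assumptions, not the patch attached to the issue: the class, enum, and method names are hypothetical, the HDFS namespace is modeled as an in-memory map, and a real servlet would turn each result into headers, a body, a redirect, or an error status.

```java
import java.util.Map;

/** Hypothetical sketch of the proposed dispatch rules; all names are illustrative. */
final class HttpFsDispatch {
    enum Kind { FILE, DIRECTORY }

    enum Action {
        SEND_HEADERS_ONLY,      // attributes as headers, empty body
        SEND_FILE_CONTENT,      // attributes as headers, file bytes as body
        REDIRECT_TO_SLASH,      // redirect "/dir" to "/dir/"
        SEND_DIRECTORY_LISTING, // attributes as headers, HTML links as body
        ERROR
    }

    /** Rule 1: everything after the context path is treated as the path. */
    static String pathAfterContext(String requestUri, String contextPath) {
        if (!requestUri.startsWith(contextPath)) {
            throw new IllegalArgumentException("URI is outside the context path");
        }
        String rest = requestUri.substring(contextPath.length());
        return rest.isEmpty() ? "/" : rest;
    }

    /**
     * Rules 2 to 5. The namespace map stands in for a namenode lookup
     * (an assumption): keys carry no trailing slash, except the root "/".
     */
    static Action decide(String method, String path, Map<String, Kind> namespace) {
        boolean head = "HEAD".equals(method);
        boolean get = "GET".equals(method);

        // Rule 2: the path names a file.
        if (namespace.get(path) == Kind.FILE) {
            if (head) return Action.SEND_HEADERS_ONLY;
            if (get) return Action.SEND_FILE_CONTENT;
            return Action.ERROR;
        }

        // Strip one trailing slash (but keep the root) before the directory lookup.
        boolean slashTerminated = path.endsWith("/");
        String key = (slashTerminated && path.length() > 1)
                ? path.substring(0, path.length() - 1)
                : path;

        if (namespace.get(key) == Kind.DIRECTORY && (head || get)) {
            // Rule 3: non-slash-terminated directory.
            if (!slashTerminated) return Action.REDIRECT_TO_SLASH;
            // Rule 4: slash-terminated directory.
            return head ? Action.SEND_HEADERS_ONLY : Action.SEND_DIRECTORY_LISTING;
        }

        // Rule 5: everything else.
        return Action.ERROR;
    }
}
```

For example, with a namespace containing the directory "/data" and the file "/data/part-0", a GET of "/data" yields REDIRECT_TO_SLASH, a GET of "/data/" yields SEND_DIRECTORY_LISTING, and a POST to "/data/part-0" yields ERROR. Treating a slash-terminated file name as an error is a choice made here; the comment does not address that case.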

Then we should try to use this as a source for MapReduce and distcp and see how it fares.
The HTTP client may need to be replaced, file status may need to be cached, etc. But this
simple approach will get us up and going, and avoid investing too much time designing a schema,
parsing XML, etc. when that may not be required.

Thoughts?

> Create FileSystem implementation to read HDFS data via http
> -----------------------------------------------------------
>
>                 Key: HADOOP-1563
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1563
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: fs
>    Affects Versions: 0.14.0
>            Reporter: Owen O'Malley
>            Assignee: Chris Douglas
>         Attachments: httpfs.patch
>
>
> There should be a FileSystem implementation that can read from a Namenode's http interface.
> This would have a couple of useful abilities:
>   1. Copy using distcp between different versions of HDFS.
>   2. Use map/reduce inputs from a different version of HDFS. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

