hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3307) Archives in Hadoop.
Date Thu, 24 Apr 2008 21:53:21 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592177#action_12592177 ]

Doug Cutting commented on HADOOP-3307:

Note that har:<path>! is a non-hierarchical, opaque URI.  Much of the Path code assumes
that URIs are hierarchical and would need to be altered to support opaque URIs.
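
To illustrate the distinction, here is a minimal sketch using only java.net.URI, not Hadoop's Path class.  The example URIs, host, port and file names are placeholders I made up; the exact syntax a har: URI would take is an assumption.

```java
import java.net.URI;

final class OpaqueUriDemo {
    // Hypothetical URIs for illustration only.
    static final String HAR_OPAQUE = "har:hdfs://host:8020/dir/my.har!/part-0";
    static final String HDFS_HIERARCHICAL = "hdfs://host:8020/dir/my.har";

    // A URI is opaque when its scheme-specific part does not start with '/'.
    static boolean isOpaque(String uri) {
        return URI.create(uri).isOpaque();
    }

    // Opaque URIs have no path component; getPath() returns null for them.
    static String pathOf(String uri) {
        return URI.create(uri).getPath();
    }

    // Resolving against an opaque base simply returns the child unchanged.
    static String resolve(String base, String child) {
        return URI.create(base).resolve(child).toString();
    }

    public static void main(String[] args) {
        System.out.println("har opaque?   " + isOpaque(HAR_OPAQUE));
        System.out.println("har path:     " + pathOf(HAR_OPAQUE));
        System.out.println("hdfs opaque?  " + isOpaque(HDFS_HIERARCHICAL));
        System.out.println("hdfs resolve: " + resolve(HDFS_HIERARCHICAL, "other.har"));
        System.out.println("har resolve:  " + resolve(HAR_OPAQUE, "other.har"));
    }
}
```

Code that builds child paths by resolving against a parent URI would silently lose the parent when the parent is opaque, which is the kind of assumption that would need to change.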

One alternative would be to always "mount" hars before access.  Mounting would just require
setting a "fs.har.<name>" property to a har file.
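
The lookup side of such a mount might be sketched as below.  This is only an illustration under assumptions: java.util.Properties stands in for the job configuration, and the HarMounts class and its lookup method are hypothetical names, not part of any Hadoop API.

```java
import java.util.Optional;
import java.util.Properties;

final class HarMounts {
    // Property prefix taken from the proposal: "fs.har.<name>".
    static final String PREFIX = "fs.har.";

    // Returns the har file location registered under the given mount name,
    // or an empty Optional when no such mount has been configured.
    static Optional<String> lookup(Properties conf, String mountName) {
        return Optional.ofNullable(conf.getProperty(PREFIX + mountName));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // "myfiles" and the hdfs location are placeholder values.
        conf.setProperty(PREFIX + "myfiles", "hdfs://host:port/dir/my.har");
        System.out.println(lookup(conf, "myfiles").orElse("<not mounted>"));
        System.out.println(lookup(conf, "other").orElse("<not mounted>"));
    }
}
```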

For example, a job could add a mount with:

job.set("fs.har.myfiles", "hdfs://host:port/dir/my.har");

Then specify its input as:


Another alternative could be to somehow escape paths in the authority of har: URIs, e.g.:


Where -c and -s are escapes for colon and slash.  Then the URIs could still be hierarchical.
The downside is that paths would look really ugly.  Sigh.
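
A rough sketch of such an escaping scheme follows.  Only the -c and -s escapes come from the comment above; encoding a literal hyphen as -- is my own assumption, added so the mapping can be reversed, and the HarAuthorityEscape class, the har://<escaped>/<file> layout and the sample values are hypothetical.

```java
import java.net.URI;

final class HarAuthorityEscape {
    // Escape ':' as "-c" and '/' as "-s"; a literal '-' becomes "--" (assumption).
    static String escape(String location) {
        StringBuilder out = new StringBuilder();
        for (char ch : location.toCharArray()) {
            switch (ch) {
                case ':': out.append("-c"); break;
                case '/': out.append("-s"); break;
                case '-': out.append("--"); break;
                default:  out.append(ch);
            }
        }
        return out.toString();
    }

    // Reverse of escape(); rejects a '-' that does not start a known escape.
    static String unescape(String escaped) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < escaped.length(); i++) {
            char ch = escaped.charAt(i);
            if (ch != '-') {
                out.append(ch);
                continue;
            }
            if (++i >= escaped.length()) {
                throw new IllegalArgumentException("dangling escape in: " + escaped);
            }
            switch (escaped.charAt(i)) {
                case 'c': out.append(':'); break;
                case 's': out.append('/'); break;
                case '-': out.append('-'); break;
                default:
                    throw new IllegalArgumentException("unknown escape in: " + escaped);
            }
        }
        return out.toString();
    }

    // Builds a hierarchical har: URI with the escaped location as its authority.
    static URI toHarUri(String location, String fileInArchive) {
        return URI.create("har://" + escape(location) + "/" + fileInArchive);
    }

    public static void main(String[] args) {
        URI uri = toHarUri("hdfs://host:8020/dir/my.har", "part-0");
        System.out.println(uri);
        System.out.println("opaque? " + uri.isOpaque());
        System.out.println("archive: " + unescape(uri.getAuthority()));
    }
}
```

Because the archive location lives entirely in the authority, the rest of the URI keeps an ordinary path, at the cost of readability.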

If we wanted to make it transparent, then we might do it by adding symbolic links to the FileSystem
API, rather than hacking DFSClient.  Then one could "mount" a har file by simply linking to
a har: URI.

> Archives in Hadoop.
> -------------------
>                 Key: HADOOP-3307
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3307
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Mahadev konar
>            Assignee: Mahadev konar
>             Fix For: 0.18.0
> This is a new feature for archiving and unarchiving files in HDFS. 

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
