Mailing-List: contact core-dev-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: core-dev@hadoop.apache.org
Message-ID: <1026400864.1237398590878.JavaMail.jira@brutus>
Date: Wed, 18 Mar 2009 10:49:50 -0700 (PDT)
From: "Johan Liesén (JIRA)" <jira@apache.org>
To: core-dev@hadoop.apache.org
Subject: [jira] Commented: (HADOOP-4952) Improved files system interface for
 the application writer.
In-Reply-To: <1117011164.1230584204197.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable


    [ https://issues.apache.org/jira/browse/HADOOP-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683107#action_12683107 ]

Johan Liesén commented on HADOOP-4952:
--------------------------------------

I do not fully grok the problem here. Is the meaning of a working directory different in HDFS from that in the local file system?

I've just briefly browsed through the NIO.2 FS API and, as far as I can tell, the notion of working directory doesn't really exist. It is only applicable to file systems where the environment (working directory, or similar) is implicitly available (or if there is none). In my opinion, this is a good thing.

In this respect: Paths.get("hdfs://...") doesn't make much sense because the environment for HDFS, the Configuration object, is not available nor is a FileSystem reference.
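
A minimal sketch of this point, assuming the java.nio.file API as it later shipped in Java 7 (the draft referred to above may differ). A relative name resolves on the default file system because its working directory is implicitly available, while a URI with the hdfs scheme has no installed provider and no environment to build one from. The class name WorkingDirSketch and the host name namenode.example are made up for illustration.

```java
import java.net.URI;
import java.nio.file.FileSystemNotFoundException;
import java.nio.file.Path;
import java.nio.file.Paths;

class WorkingDirSketch {
    // Resolves a relative name against the default file system's implicit
    // working directory.
    static Path resolveLocal(String relativeName) {
        return Paths.get(relativeName).toAbsolutePath();
    }

    // Returns true if the URI can be turned into a Path without supplying
    // any extra environment (configuration, file system reference).
    static boolean resolvable(URI uri) {
        try {
            Paths.get(uri);
            return true;
        } catch (FileSystemNotFoundException e) {
            // No provider is installed for the scheme, or no file system
            // has been created for it.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveLocal("data.txt"));
        // "namenode.example" is a hypothetical host name.
        System.out.println(resolvable(URI.create("hdfs://namenode.example/user/alice")));
    }
}
```

On a JVM with no hdfs provider on the classpath, the second call reports that the URI cannot be resolved; the first always yields an absolute path.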

Wrt. 3. I'm not sure I understand the meaning of "slash relative names"; isn't that an absolute path?
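
One possible reading, which nothing in this message confirms: a name that starts with a slash is absolute within a file system, but still relative to the default file system (scheme and authority). The sketch below illustrates that reading with the generic resolution rules of java.net.URI. The host names and paths are hypothetical, and this is not how Hadoop itself resolves names.

```java
import java.net.URI;

class SlashRelativeSketch {
    // Resolves a name against a base URI using java.net.URI's generic
    // resolution rules.
    static URI resolve(URI base, String name) {
        return base.resolve(name);
    }

    public static void main(String[] args) {
        // Hypothetical default file system plus working directory.
        URI workingDir = URI.create("hdfs://nn1.example:8020/user/alice/");

        // Relative to the working directory.
        System.out.println(resolve(workingDir, "data.txt"));
        // Starts with a slash: keeps scheme and authority, replaces the path.
        System.out.println(resolve(workingDir, "/tmp/data.txt"));
        // Fully qualified: the base is ignored.
        System.out.println(resolve(workingDir, "hdfs://nn2.example:8020/tmp/data.txt"));
    }
}
```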

> Improved files system interface for the application writer.
> -----------------------------------------------------------
>
>                 Key: HADOOP-4952
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4952
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.21.0
>            Reporter: Sanjay Radia
>            Assignee: Sanjay Radia
>         Attachments: Files.java
>
>
> Currently the FileSystem interface serves two purposes:
> - an application writer's interface for using the Hadoop file system
> - a file system implementer's interface (e.g. hdfs, local file system, kfs, etc)
> This Jira proposes that we provide a simpler interface for the application writer and leave the FileSystem interface for the implementer of a filesystem.
> - FileSystem interface has a confusing set of methods for the application writer
> - We could make it easier to take advantage of the URI file naming
> ** Current approach is to get FileSystem instance by supplying the URI and then access that name space. It is consistent for the FileSystem instance to not accept URIs for other schemes, but we can do better.
> ** The special copyFromLocalFile can be generalized as a copyFile where the src or target can be generalized to any URI, including the local one.
> ** The proposed scheme (below) simplifies this.
> - The client side config can be simplified.
> ** New config() by default uses the default config. Since this is the common usage pattern, one should not need to always pass the config as a parameter when accessing the file system.
> -
> ** It does not handle multiple file systems too well. Today a site.xml is derived from a single Hadoop cluster. This does not make sense for multiple Hadoop clusters which may have different defaults.
> ** Further one should need very little to configure the client side:
> *** Default file system.
> *** Block size
> *** Replication factor
> *** Scheme to class mapping
> ** It should be possible to take Blocksize and replication factor defaults from the target file system, rather than the client side config. I am not suggesting we don't allow setting client side defaults, but most clients do not care and would find it simpler to take the defaults for their systems from the target file system.
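
The generalized copyFile idea in the quoted description can be sketched with the JDK alone. This is not Hadoop code and not the attached Files.java: it swaps in java.nio.file (Java 7 and later), where the scheme of each URI selects the file system provider, so the caller never obtains a file system instance first. The names UriCopySketch, copyFile and roundTrip are hypothetical, and only the file scheme is exercised here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class UriCopySketch {
    // Hypothetical helper: copies src to dst, where either side may be any
    // URI whose scheme has an installed provider, including the local one.
    static void copyFile(URI src, URI dst) throws IOException {
        Path from = Paths.get(src);
        Path to = Paths.get(dst);
        Files.copy(from, to, StandardCopyOption.REPLACE_EXISTING);
    }

    // Writes text to a temporary file, copies it by URI, and returns what
    // ends up in the destination file.
    static String roundTrip(String text) {
        try {
            Path src = Files.createTempFile("src", ".txt");
            Path dst = Files.createTempFile("dst", ".txt");
            try {
                Files.write(src, text.getBytes(StandardCharsets.UTF_8));
                copyFile(src.toUri(), dst.toUri());
                return new String(Files.readAllBytes(dst), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(src);
                Files.deleteIfExists(dst);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

A Hadoop version of the same idea would presumably need the scheme to class mapping from the client side config in place of the JDK's provider lookup.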

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.