hadoop-common-issues mailing list archives

From "jay vyas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-2025) Instantiating a FileSystem object should guarantee the existence of the working directory
Date Sun, 12 Oct 2014 02:28:33 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14168475#comment-14168475
] 

jay vyas commented on HADOOP-2025:
----------------------------------

Thanks for this. It's true: this is not in the purview of the filesystem itself. It's really
up to the Hadoop *solution* (i.e. the thing that a vendor is giving to a client) to solve
this problem.
In the upstream we have *Bigtop*, which aims to fill that gap in a free and open way
for the community, and which produces idioms for others to follow around setting up and maintaining
a distributed Hadoop-based big data product. To solve this problem, we have:
- a filesystem-agnostic provisioner, https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/bigtop-utils/provision.groovy
and
- a universal JSON file format for defining the filesystem's schema, https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/init-hcfs.json
Hopefully those artifacts can help people who need to solve this problem in a way that is FS
agnostic and maintainable.


> Instantiating a FileSystem object should guarantee the existence of the working directory
> -----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2025
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2025
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.14.1
>            Reporter: Sameer Paranjpye
>            Assignee: Chris Douglas
>         Attachments: 2025-1.patch, 2025.patch
>
>
> Issues like HADOOP-1891 and HADOOP-1916 illustrate the need for this behavior.
> In HADOOP-1916 the problem is that the default working directory for a user on HDFS, '/user/<username>',
> does not exist. This results in the command 'hadoop dfs -copyFromLocal foo .' creating a *file*
> called /user/<username> and copying the contents of the file 'foo' into this file.
> HADOOP-1891 is basically the same problem. The problem that Olga observed was that copying
> a file to '.' on HDFS when her 'home directory' did not exist resulted in the creation of
> a file with the path as her home directory. The problem is incorrectly filed as a bug in the
> Path class. The behavior of Path is correct; as Doug points out, it is perfectly reasonable
> for Path(".") to convert to an empty path. When this empty path is resolved in HDFS or any
> other filesystem, the resolution to '/user/<username>' is also correct (at least for
> HDFS). The problem IMO is that the existence of the working directory is not guaranteed.
> When I log in to a machine, my default working directory is '/home/sameerp', and filesystem
> operations that I execute with relative paths all work correctly because this directory exists.
> My home directory lives on a filer; in the event of it being unmountable, the default working
> directory I get is '/', which is also guaranteed to exist.
> In the context of Hadoop, instantiating a FileSystem object is the analogue of logging
> in and should result in a working directory whose existence has been validated. In the case
> of HDFS this should be '/user/<username>', or '/' if the directory does not exist.
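
The fallback rule proposed in the description (use the home directory if it exists, otherwise '/') can be sketched as follows. This is only an illustration against the local filesystem using java.nio.file, not the attached patch and not Hadoop's FileSystem API; the class name WorkingDirResolver and the method resolveWorkingDirectory are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper illustrating the proposal; not part of Hadoop.
final class WorkingDirResolver {
    private WorkingDirResolver() {}

    /**
     * Returns the home directory if it exists and is a directory;
     * otherwise returns the fallback root, which is assumed to exist.
     * A regular file sitting at the home path (the HADOOP-1916 symptom)
     * also triggers the fallback.
     */
    static Path resolveWorkingDirectory(Path home, Path fallbackRoot) {
        return Files.isDirectory(home) ? home : fallbackRoot;
    }
}
```

On HDFS the same check would presumably be built from FileSystem.getHomeDirectory(), FileSystem.getFileStatus(Path) and FileSystem.setWorkingDirectory(Path) when the FileSystem object is created, but that wiring is an assumption here and is not taken from the patches on this issue.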



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
