hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matt Foley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-7360) FsShell does not preserve relative paths with globs
Date Tue, 14 Jun 2011 22:10:47 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049467#comment-13049467
] 

Matt Foley commented on HADOOP-7360:
------------------------------------

TestPathData:

Multiple places: The preferred way to express "new Path(testDir+"/d1")" is "new Path(testDir,
"d1")".  This avoids hard-coding the path delimiter for a local filesystem path.

initialize():
* Consider "fs.createNewFile(Path)" instead of "fs.create(Path).close()"
* If you put the "setWorkingDirectory" command near the beginning of the method, wouldn't
all the creates and mkdirs get simpler?

I'm concerned that testDir is a relative path.  Can you convert it to an absolute path as
early in the process as possible, before anyone might have set a non-default working directory?

Why did you remove testWithFsAndPath()?  It is not a null test.

getParent() should be named testGetParent().

testRelativeGlobBack(): see previous concern about making "testDir" absolute before interacting
with second and later setWorkingDirectory() invocations.  This might give you testDir/testDir/d1
?

PathData:

Agree with your elimination of PathData(FileSystem, Path, FileStatus),
but suggest also eliminating PathData(FileSystem, String, FileStatus)
and PathData(FileSystem, FileStatus).
Basically, the status should be looked up in the FS, not set as an argument.
Should do so at the end of method PathData(FileSystem, String).
In comment for PathData(FileSystem, String), remove "Convenience" adjective.  It becomes the
primary ctor.

setStat(boolean ignoreFNF) is counter-intuitive.  Most people seeing "setStat(boolean)" will
assume the status is being set to the argument.  Please call it lookupStatus(boolean ignoreFNF).
 Or could use two methods, both with zero arguments, called setStat() and setStatIgnoreFNF().

Suggest letting lookupStatus(ignoreFNF) return the stat value instead of void.  Then refreshStatus
just becomes "return lookupStatus(false);".

getChecksumFile(): Why is this method needed (returns PathData), vs just using Fs.getChecksumFile()
(returns Path)?

getParent(): Several concerns:
* Why is this method needed instead of just using Path.getParent()?  There are currently no
callers of this new method.
* Why is it important to preserve the relative-ness of the parent path?  Trying to do so exposes
you to serious issues because the current working directory may have changed since PathData#string
was stored, which means the value returned by this method could be meaningless if string was
relative.  I think I would go so far as to say this can't work.  I would much rather let Path.getParent()
do what it was carefully implemented to do.
* The same issue applies to the next two chunks.

This brings up the issue of why PathData saves the value of #string?
This just begs for such mis-use.  I've opened HADOOP-7393 for this.


> FsShell does not preserve relative paths with globs
> ---------------------------------------------------
>
>                 Key: HADOOP-7360
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7360
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.23.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HADOOP-7360.patch
>
>
> FsShell currently preserves relative paths that do not contain globs.  Unfortunately
the method {{fs.globStatus()}} is fully qualifying all returned paths.  This is causing inconsistent
display of paths.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message