hadoop-common-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8906) paths with multiple globs are unreliable
Date Wed, 10 Oct 2012 20:43:03 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473542#comment-13473542 ]

Jason Lowe commented on HADOOP-8906:
------------------------------------

The changes look good, but I think there's a lingering bug in how GlobExpander's results
are handled.  For example, I would expect the following test to pass:
{noformat}
      assertEquals(0, fs.globStatus(new Path("/{nonexistent1/a,nonexistent2/b}")).length);
{noformat}
However, it fails because globStatus returns null instead of an empty array.  FsShell
relies on globStatus never returning null for paths that contain pattern characters,
and we could end up creating paths with pattern characters.  I checked trunk, and it turns
out this test crashes globStatus internally.  Not crashing is nice, but we shouldn't be
reporting null for paths with pattern characters.  Fortunately the fix is pretty easy: check
whether GlobExpander returned more than one result, which implies patterns were used, and in
that case convert a null result into an empty array.
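
To illustrate the proposed check, here is a minimal, self-contained sketch.  It is not the
actual patch and does not use the Hadoop API: the class GlobNullSketch, the helpers expand
and lookupLiteral, and the in-memory EXISTING set are all hypothetical stand-ins, and plain
strings take the place of Path and FileStatus.  It assumes a single, non-nested {a,b} group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GlobNullSketch {
    // Hypothetical stand-in for the file system: the set of paths that exist.
    static final Set<String> EXISTING = new HashSet<>(Arrays.asList("/data/a"));

    // Hypothetical stand-in for GlobExpander: expands one {x,y} group, if any.
    static List<String> expand(String pattern) {
        int open = pattern.indexOf('{');
        int close = pattern.indexOf('}', open + 1);
        if (open < 0 || close < 0) {
            return Collections.singletonList(pattern);
        }
        String prefix = pattern.substring(0, open);
        String suffix = pattern.substring(close + 1);
        List<String> expanded = new ArrayList<>();
        for (String alternative : pattern.substring(open + 1, close).split(",")) {
            expanded.add(prefix + alternative + suffix);
        }
        return expanded;
    }

    // Lookup of a path without pattern characters: null when it does not exist.
    static String[] lookupLiteral(String path) {
        return EXISTING.contains(path) ? new String[] {path} : null;
    }

    // Simplified globStatus: merges the per-expansion results.
    static String[] globStatus(String pattern) {
        List<String> expanded = expand(pattern);
        List<String> matches = new ArrayList<>();
        for (String path : expanded) {
            String[] result = lookupLiteral(path);
            if (result != null) {
                matches.addAll(Arrays.asList(result));
            }
        }
        if (matches.isEmpty()) {
            // More than one expansion implies the caller used a pattern,
            // so report "no matches" as an empty array rather than null.
            return expanded.size() > 1 ? new String[0] : null;
        }
        return matches.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Pattern with no matches: empty array, so this prints 0.
        System.out.println(globStatus("/{nonexistent1/a,nonexistent2/b}").length);
        // Literal path that does not exist: still null, so this prints true.
        System.out.println(globStatus("/nonexistent1/a") == null);
    }
}
```

In this sketch a nonexistent literal path still yields null, so callers that use null to
mean "no such file" keep working; only the expanded-pattern case changes.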
                
> paths with multiple globs are unreliable
> ----------------------------------------
>
>                 Key: HADOOP-8906
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8906
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HADOOP-8906.patch, HADOOP-8906.patch
>
>
> Let's say we have a structure of "$date/$user/stuff/file".  Multiple globs are unreliable unless every directory in the structure exists.
> These work:
> date*/user
> date*/user/stuff
> date*/user/stuff/file
> These fail:
> date*/user/*
> date*/user/*/*
> date*/user/stu*
> date*/user/stu*/*
> date*/user/stu*/file
> date*/user/stuff/*
> date*/user/stuff/f*

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
