hadoop-common-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8906) paths with multiple globs are unreliable
Date Thu, 11 Oct 2012 03:17:04 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473802#comment-13473802 ]

Jason Lowe commented on HADOOP-8906:

Thanks for the updates, Daryn.  I'm eager to see the Jenkins results, although it seems the
Jenkins build is stuck right now.

After a closer look, I'm wondering if there's one more subtle difference between the old and
new versions, this time when the user specifies a filter.  I think the old version will return
null in the case of a non-globbed path that found a file which didn't pass the specified
filter.  It applies the filter as it searches before it checks for the empty-array-should-return-null
case.  In the new version, it applies the specified filter *after* it checks for whether there
are matches, and in this scenario there will be a match since the filter hasn't been applied
yet.  I'm wondering if we should apply the filter before checking for an empty match array
to align with the old behavior.  Granted it's odd to provide a non-globbed path that wouldn't
pass the filter, so it's a corner case.
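
To make the ordering difference concrete, here is a minimal sketch in plain Java. It does not
use the actual FileSystem/globStatus code from the patch; the class and method names
(GlobFilterOrder, filterThenCheck, checkThenFilter) are hypothetical stand-ins for the two
orderings described above, and a Predicate<String> stands in for the user's PathFilter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class GlobFilterOrder {

    // Hypothetical model of the old ordering: apply the filter while
    // collecting matches, then turn an empty result into null.
    static List<String> filterThenCheck(List<String> matches, Predicate<String> filter) {
        List<String> kept = new ArrayList<>();
        for (String m : matches) {
            if (filter.test(m)) {
                kept.add(m);
            }
        }
        return kept.isEmpty() ? null : kept;
    }

    // Hypothetical model of the new ordering: check for an empty match
    // list first, then apply the filter to whatever was found.
    static List<String> checkThenFilter(List<String> matches, Predicate<String> filter) {
        if (matches.isEmpty()) {
            return null;
        }
        List<String> kept = new ArrayList<>();
        for (String m : matches) {
            if (filter.test(m)) {
                kept.add(m);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // A non-globbed path that exists but is rejected by the filter.
        List<String> found = Arrays.asList("date1/user/stuff/file");
        Predicate<String> rejectAll = p -> false;

        System.out.println(filterThenCheck(found, rejectAll)); // prints "null"
        System.out.println(checkThenFilter(found, rejectAll)); // prints "[]"
    }
}
```

In this model the first ordering yields null and the second yields an empty list for the
same input, which is the discrepancy in question. Moving the emptiness check after the
filter in the second method would make the two agree.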

Otherwise everything looks great, pending Jenkins since it found some things in the last run
that we missed.
> paths with multiple globs are unreliable
> ----------------------------------------
>                 Key: HADOOP-8906
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8906
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HADOOP-8906.patch, HADOOP-8906.patch, HADOOP-8906.patch, HADOOP-8906.patch
> Let's say we have a structure of "$date/$user/stuff/file".  Multiple globs are unreliable
> unless every directory in the structure exists.
> These work:
> date*/user
> date*/user/stuff
> date*/user/stuff/file
> These fail:
> date*/user/*
> date*/user/*/*
> date*/user/stu*
> date*/user/stu*/*
> date*/user/stu*/file
> date*/user/stuff/*
> date*/user/stuff/f*

