Date: Thu, 11 Oct 2012 16:07:06 +0000 (UTC)
From: "Jason Lowe (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Commented] (HADOOP-8906) paths with multiple globs are unreliable

    [ https://issues.apache.org/jira/browse/HADOOP-8906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474267#comment-13474267 ]

Jason Lowe commented on HADOOP-8906:
------------------------------------

bq. In essence, perhaps a user filter means the query is always a glob? I can see it going either way.

Yes, I thought about that as well.
Maybe it would be more consistent to return an empty array instead of null in that case, but I was erring on the side of caution to maintain compatibility with the previous version's behavior. It all comes down to what a result of null really means. If null is being used to check whether the path contains globs, then arguably we should continue to return null, because someone could be using/abusing globStatus(path, falseFilter) to check for globs in a path even if the path exists in the filesystem.

> paths with multiple globs are unreliable
> ----------------------------------------
>
>                 Key: HADOOP-8906
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8906
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HADOOP-8906-branch_0.23.patch, HADOOP-8906.patch, HADOOP-8906.patch, HADOOP-8906.patch, HADOOP-8906.patch, HADOOP-8906.patch
>
>
> Let's say we have a structure of "$date/$user/stuff/file". Multiple globs are unreliable unless every directory in the structure exists.
> These work:
>     date*/user
>     date*/user/stuff
>     date*/user/stuff/file
> These fail:
>     date*/user/*
>     date*/user/*/*
>     date*/user/stu*
>     date*/user/stu*/*
>     date*/user/stu*/file
>     date*/user/stuff/*
>     date*/user/stuff/f*
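
The null-versus-empty question raised in the comment above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the class GlobNullSketch, its in-memory set of paths, and the simplified matching (only '*' and '?' are treated as globs) are hypothetical and are not Hadoop's FileSystem.globStatus implementation. It models the "keep returning null" option: a pattern with no glob characters that yields nothing (missing, or rejected by the filter) gives null, while a glob pattern that yields nothing gives an empty array.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical toy model; NOT the Hadoop implementation.
final class GlobNullSketch {
    // Stand-in for a filesystem: the set of paths that "exist".
    private final Set<String> paths;

    GlobNullSketch(Set<String> paths) {
        this.paths = new TreeSet<>(paths);
    }

    // Assumption: only '*' and '?' count as glob characters in this sketch.
    static boolean hasGlob(String pattern) {
        return pattern.indexOf('*') >= 0 || pattern.indexOf('?') >= 0;
    }

    // Translate the simplified glob to a regex; wildcards never cross '/'.
    static Pattern toRegex(String glob) {
        StringBuilder sb = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                sb.append("[^/]*");
            } else if (c == '?') {
                sb.append("[^/]");
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    // Returns null when a non-glob pattern yields nothing, and an empty
    // array when a glob pattern yields nothing.
    String[] globStatus(String pattern, Predicate<String> filter) {
        Pattern regex = toRegex(pattern);
        List<String> matches = new ArrayList<>();
        for (String p : paths) {
            if (regex.matcher(p).matches() && filter.test(p)) {
                matches.add(p);
            }
        }
        if (matches.isEmpty() && !hasGlob(pattern)) {
            return null;
        }
        return matches.toArray(new String[0]);
    }
}
```

Under this model, calling globStatus(path, p -> false) returns null exactly when the path contains no glob characters, which is the "check for globs" use the comment worries about breaking. Returning an empty array for the filtered non-glob case instead would make that check indistinguishable from a glob that matched nothing.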