Date: Thu, 18 Jun 2015 00:37:02 +0000 (UTC)
From: "Colin Patrick McCabe (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Commented] (HADOOP-10798) globStatus() does not return sorted list of files

    [ https://issues.apache.org/jira/browse/HADOOP-10798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590935#comment-14590935 ]

Colin Patrick McCabe commented on HADOOP-10798:
-----------------------------------------------

bq. I'd like to go ahead and remove the sorting language from the API.

OK

bq. There's no need to do the sort on shell-side since it's not being done already and hence there should be no behavior change.

Disagree. "ls" on UNIX has always returned entries in sorted order.
So does Hadoop's ls, except in the special case where you are using a non-HDFS filesystem. This is not a normal case, so the discrepancy got overlooked. But I think we should fix it now. We don't need to be ultra-fast in the shell, so there seems to be no reason why we shouldn't just add a sort.

At least that's my thinking right now. What do you think?

> globStatus() does not return sorted list of files
> -------------------------------------------------
>
>                 Key: HADOOP-10798
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10798
>             Project: Hadoop Common
>          Issue Type: Bug
>   Affects Versions: 2.3.0
>            Reporter: Felix Borchers
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>              Labels: BB2015-05-TBR
>         Attachments: HADOOP-10798.001.patch
>
>
> (FileSystem) globStatus() does not return a sorted file list anymore.
> But the API says: " ... Results are sorted by their names."
> Seems to be lost, when the Globber Object was introduced. Can't find a sort in actual code.
> code to check this behavior:
> {code}
> Configuration conf = new Configuration();
> FileSystem fs = FileSystem.get(conf);
> Path path = new Path("/tmp/" + System.currentTimeMillis());
> fs.mkdirs(path);
> fs.deleteOnExit(path);
> fs.createNewFile(new Path(path, "2"));
> fs.createNewFile(new Path(path, "3"));
> fs.createNewFile(new Path(path, "1"));
> FileStatus[] status = fs.globStatus(new Path(path, "*"));
> Collection list = new ArrayList();
> for (FileStatus f: status) {
>     list.add(f.getPath().toString());
>     //System.out.println(f.getPath().toString());
> }
> boolean sorted = Ordering.natural().isOrdered(list);
> Assert.assertTrue(sorted);
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
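
For illustration only, the caller-side sort discussed in the comment might look roughly like the sketch below. It is not the attached patch. It uses plain strings as stand-ins for the path names a glob would return, so it runs without Hadoop on the classpath. The class name SortedListingSketch and both method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in: path names are modelled as plain strings,
// not as Hadoop FileStatus objects.
public class SortedListingSketch {

    // Returns a sorted copy of the given names; the input list is not modified.
    public static List<String> sortedByName(List<String> paths) {
        List<String> copy = new ArrayList<String>(paths);
        Collections.sort(copy); // natural (lexicographic) String order
        return copy;
    }

    // Returns true if every element is <= its successor.
    public static boolean isSorted(List<String> paths) {
        for (int i = 1; i < paths.size(); i++) {
            if (paths.get(i - 1).compareTo(paths.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Same creation order as the reproduction in the issue: 2, 3, 1.
        List<String> listing = Arrays.asList("/tmp/x/2", "/tmp/x/3", "/tmp/x/1");
        System.out.println(isSorted(listing));
        System.out.println(isSorted(sortedByName(listing)));
    }
}
```

With real globStatus() results, the analogous step would be to order the returned FileStatus entries by getPath() before printing them. Whether that belongs in the shell or in the API is the open question in this thread.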