hadoop-common-issues mailing list archives

From "Rui Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10798) globStatus() does not return sorted list of files
Date Wed, 20 May 2015 06:32:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551900#comment-14551900 ]

Rui Li commented on HADOOP-10798:
---------------------------------

Hi guys, does parallel sorting rely on the returned files being sorted? If the sorted results
are spread across multiple files, we need to read the files in a specific order to maintain
the global sort, right?
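
To illustrate the concern: a reader that depends on file order can sort the listing itself
instead of relying on the order globStatus() happens to return. The following is only a
minimal sketch, using plain strings as stand-ins for the paths obtained from
FileStatus.getPath(); the class name, the method name, and the part-file names are
hypothetical and not taken from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class OrderedPartFiles {

    // Returns a sorted copy of the listing; the caller's list is left untouched.
    // Assumption: names are zero-padded, so lexicographic order matches the
    // intended read order (e.g. part-00002 sorts before part-00010).
    static List<String> inReadOrder(List<String> listing) {
        List<String> sorted = new ArrayList<String>(listing);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        // Stand-in for a listing that came back in arbitrary order.
        List<String> listing = Arrays.asList(
                "/out/part-00002", "/out/part-00000", "/out/part-00001");
        for (String path : inReadOrder(listing)) {
            System.out.println(path);
        }
    }
}
```

Sorting on the caller's side keeps the global order intact whether or not the listing
arrives sorted, at the cost of one extra sort per listing.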

> globStatus() does not return sorted list of files
> -------------------------------------------------
>
>                 Key: HADOOP-10798
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10798
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.3.0
>            Reporter: Felix Borchers
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>              Labels: BB2015-05-TBR
>         Attachments: HADOOP-10798.001.patch
>
>
> (FileSystem) globStatus() does not return a sorted file list anymore.
> But the API says: " ... Results are sorted by their names."
> Seems to be lost, when the Globber Object was introduced. Can't find a sort in actual code.
> code to check this behavior:
> {code}
>         Configuration conf = new Configuration();
>         FileSystem fs = FileSystem.get(conf);
>         Path path = new Path("/tmp/" + System.currentTimeMillis());
>         fs.mkdirs(path);
>         fs.deleteOnExit(path);
>         fs.createNewFile(new Path(path, "2"));
>         fs.createNewFile(new Path(path, "3"));
>         fs.createNewFile(new Path(path, "1"));
>         FileStatus[] status = fs.globStatus(new Path(path, "*"));
>         Collection<String> list = new ArrayList<String>();
>         for (FileStatus f: status) {
>             list.add(f.getPath().toString());
>             //System.out.println(f.getPath().toString());
>         }
>         boolean sorted = Ordering.natural().isOrdered(list);
>         Assert.assertTrue(sorted);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
