hadoop-hdfs-dev mailing list archives

From Allen Wittenauer ...@altiscale.com>
Subject Re: DISCUSS: is the order in FS.listStatus() required to be sorted?
Date Mon, 01 Jun 2015 15:29:13 GMT

	The POSIX spec for readdir (http://pubs.opengroup.org/onlinepubs/009695399/functions/readdir.html)
doesn’t spell out a sort order, so it should be assumed that the ordering isn’t guaranteed.

	Chris Siebenmann has written a few relevant blog posts on the topic that might be of interest
here:

		* https://utcc.utoronto.ca/~cks/space/blog/unix/ReaddirHistory
		* https://utcc.utoronto.ca/~cks/space/blog/unix/ReaddirOrder

	So I think it’s OK to break the _API_ here ...

	** HOWEVER **

	POSIX ls (http://pubs.opengroup.org/onlinepubs/000095399/utilities/ls.html) DOES require
that its output be sorted.  So breaking the sort order of 'hadoop fs -ls' would be *extremely*
bad.  We need to make sure that doesn’t change.
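
	To make the distinction concrete, here is a minimal sketch (not code from the patch) of a
caller that wants ls-style output from an API with no ordering guarantee: it sorts the listing
itself rather than trusting the order it gets back. It uses plain java.io.File so that it runs
without Hadoop on the classpath; the class and method names (SortedListing, sortByName,
sortedNames) are made up for illustration. The same idea would presumably apply to the array
returned by FS.listStatus().

```java
import java.io.File;
import java.util.Arrays;

// Hypothetical helper: names are illustrative, not part of Hadoop or the JDK.
class SortedListing {

    /** Returns a sorted copy of the given names, leaving the input untouched. */
    static String[] sortByName(String[] names) {
        String[] copy = Arrays.copyOf(names, names.length);
        Arrays.sort(copy); // natural String order (by UTF-16 code unit)
        return copy;
    }

    /** Lists a directory and sorts the result; empty array if it cannot be listed. */
    static String[] sortedNames(File dir) {
        String[] names = dir.list(); // javadoc: no guarantee of any specific order
        if (names == null) {
            return new String[0];    // not a directory, or an I/O error occurred
        }
        return sortByName(names);
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (String name : sortedNames(dir)) {
            System.out.println(name);
        }
    }
}
```

	Sorting at the presentation layer like this keeps the user-visible ordering stable while
leaving the underlying API free to return entries in whatever order is cheapest.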

On Jun 1, 2015, at 4:11 AM, Vinayakumar B <vinayakumarb@apache.org> wrote:

> I think the patch just updates the doc as of now, not changing any code to
> affect the existing usage.
> 
> Sorting depends on the underlying implementations.
> 
> The Linux *ls* implementation returns an alphanumerically sorted listing by
> default (the current implementation might have assumed sorting by default
> from this, just guessing ...), but it has other options to sort on different
> attributes.
> 
> Java's *File.listFiles()* javadoc specifies as follows: *There is no
> guarantee that the name strings in the resulting array will appear in any
> specific order; they are not, in particular, guaranteed to appear in
> alphabetical order.*
> So the current change is in line with Java's FileSystem API at least.
> 
> So IMO, it's fine to commit the javadoc update.
> 
> -Vinay
> 
> On Mon, Jun 1, 2015 at 3:51 PM, Steve Loughran <stevel@hortonworks.com>
> wrote:
> 
>> 
>> HADOOP-12009 (https://issues.apache.org/jira/browse/HADOOP-12009) patches
>> the FS javadoc and contract tests to say "the order you get things back
>> from a listStatus() isn't guaranteed to be alphanumerically sorted"
>> 
>> That's one of those assumptions which we all have, but which, when you
>> think about it, doesn't have to be guaranteed.
>> 
>> I'm going to commit the patch with the updated docs. Before I do that,
>> does anyone have any objection -that is, is there some fundamental
>> constraint which requires it to come back sorted? Such as the FS APIs and
>> other apps which do expect that sorting, and which are going to break if
>> the rules change? If so, they may need to be looked at.
>> 
>> -Steve
>> 

