hadoop-common-dev mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: Hadoop with case-preservation and case-insensitivity
Date Mon, 09 Mar 2009 13:48:27 GMT
Doug Cutting wrote:
> Paul Sheer wrote:
>> I have the requirement to use Hadoop with case-insensitivity and
>> case-preservation ala Windows.
> 
> I think you may have difficulty convincing folks that Hadoop should 
> directly support this mode of operation, and it's also a bad idea to run 
> a hacked version of HDFS, since that will be hard to maintain.
> 
> The safest and simplest way to support this might be to layer it on top 
> of the standard API.  You can implement a FilterFileSystem that, when 
> opening files or listing directories, uses case-insensitive comparisons. 
>  So, to open "/foo/bar" you'd first list "/" looking for subdirectories 
> which case-insensitively match "foo", then, if one is found, list it 
> looking for a file which case-insensitively matches "bar".  Could this 
> suffice?
> 
> Doug
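
The lookup Doug describes can be sketched roughly as below. This is only an
illustration, not Hadoop code: the class CaseInsensitiveResolver and its
in-memory listing map are hypothetical stand-ins for listing directories
through the real FileSystem API, and treating an ambiguous match as "not
found" is an assumption, not something the thread specifies.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: resolves a path one component at a time by listing
// each parent directory and matching child names case-insensitively.
class CaseInsensitiveResolver {
    // Stand-in for a real directory listing: directory path -> child names.
    private final Map<String, List<String>> listing = new HashMap<>();

    void addDir(String path, String... children) {
        listing.put(path, Arrays.asList(children));
    }

    // Returns the stored spelling of the path, or null if some component
    // has no match or has more than one case-insensitive match.
    String resolve(String path) {
        String current = "";
        for (String part : path.split("/")) {
            if (part.isEmpty()) {
                continue; // skip the empty component before the leading "/"
            }
            List<String> children = listing.get(current.isEmpty() ? "/" : current);
            if (children == null) {
                return null; // parent is not a known directory
            }
            String match = null;
            for (String child : children) {
                // equalsIgnoreCase is locale-independent; see the locale caveat
                if (child.equalsIgnoreCase(part)) {
                    if (match != null) {
                        return null; // two names differ only by case: ambiguous
                    }
                    match = child;
                }
            }
            if (match == null) {
                return null;
            }
            current = current + "/" + match;
        }
        return current.isEmpty() ? "/" : current;
    }

    public static void main(String[] args) {
        CaseInsensitiveResolver r = new CaseInsensitiveResolver();
        r.addDir("/", "foo");
        r.addDir("/foo", "bar");
        System.out.println(r.resolve("/FOO/Bar")); // prints /foo/bar
    }
}
```

Note that every lookup costs one directory listing per path component, which
is the price of layering this on top of a case-sensitive store.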

Full Windows case logic is pretty bizarre, as you need to ignore case in 
all file operations; "mv lower LOWER" would result in a file called 
"lower" because of the rule that if there is an existing destination file 
whose case-insensitive name matches that of the target file, its name 
becomes the destination name.
Other issues:
- it should be impossible to create two files in the same directory with 
the same case-insensitive name.
- you need to take locale into account when comparing case. Turkish is 
the test case, as "I".toLower()!="i" in that locale; it's the place where 
you get the bug reports when your logic is broken.
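
A minimal sketch of the Turkish problem, assuming Java and its standard
String.toLowerCase(Locale) method. The class name TurkishCaseDemo and the
helper sameNameIn are made up for illustration.

```java
import java.util.Locale;

// Hypothetical demo: the same two names compare equal or unequal
// depending on which locale's lower-casing rules are applied.
class TurkishCaseDemo {
    static final Locale TURKISH = Locale.forLanguageTag("tr-TR");

    // Compares two names after lower-casing both in the given locale.
    static boolean sameNameIn(Locale locale, String a, String b) {
        return a.toLowerCase(locale).equals(b.toLowerCase(locale));
    }

    public static void main(String[] args) {
        // In Turkish, upper-case "I" lower-cases to dotless "ı" (U+0131).
        System.out.println("I".toLowerCase(TURKISH).equals("i"));      // false
        System.out.println("I".toLowerCase(Locale.ROOT).equals("i"));  // true

        // Lower-casing both sides in one locale is at least self-consistent,
        // but comparing against an already lower-cased literal is not.
        System.out.println(sameNameIn(TURKISH, "FILE", "file"));       // false
        System.out.println(sameNameIn(Locale.ROOT, "FILE", "file"));   // true
    }
}
```

Whatever rule is chosen has to be fixed for the whole filesystem rather than
taken from the JVM default locale, otherwise two clients can disagree about
whether two names collide.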

I would steer well clear of it.
